Constructing a minimal vertex cover of a graph can be seen as a prototype for a combinatorial
optimization problem under hard constraints. In this paper, we develop and analyze message-passing
techniques, namely warning and survey propagation, which serve as efficient heuristic algorithms for
solving these computationally hard problems. We also show how previously obtained results on the
typical-case behavior of vertex covers of random graphs can be recovered starting from the
message-passing equations, and how they can be extended.

I. INTRODUCTION



The minimal vertex-cover (VC) problem belongs to the most difficult class of optimization problems in graph theory.
It asks to mark a minimal number of vertices of a graph, such that each edge of the graph is incident to at least
one of the selected vertices. This problem is known to be NP-hard, which means in particular that all currently known
algorithms construct minimal vertex covers in a computational time which scales exponentially with the size of the
graph. The applicability of such exact algorithms is therefore restricted to rather small graphs of a few hundred
vertices.

There are, however, applications of the vertex-cover problem and other, closely related optimization problems
[?] to a huge number of real-world network problems. Examples are the monitoring of Internet traffic [4], the
prevention of denial-of-service attacks [?], and immunization strategies in networks [?]. Another technically related
network problem is that of counting loops in networks, recently analyzed on the basis of statistical-physics methods
[?]. The dimensions of the underlying networks easily exceed the graph sizes treatable with exact algorithms, and
heuristic methods to construct solutions that are as small as possible are needed.

In this paper we set up two message-passing techniques based on the statistical-physics approach to combinatorial
optimization problems [?], more precisely based on the cavity method for diluted systems and its algorithmic
interpretation [11, 12]. The first of the message-passing techniques, the so-called warning propagation, is equivalent
to the Bethe-Peierls iterative scheme and is therefore related to the assumption of replica symmetry; the second one is
a survey propagation algorithm related to one-step replica symmetry breaking. Both algorithms have already been
formulated for the vertex cover problem by one of the authors in [?]; here we go beyond this presentation, providing
both a more elegant setting and a thorough analysis of the algorithmic performance.

A natural test bed for the proposed algorithms is provided by finite-connectivity random graphs [13]. The typical
properties of VCs on such graphs have already been analyzed both with rigorous mathematical tools [?] and
with statistical-physics methods [?]. We therefore have fairly complete knowledge of the
phase diagram of this problem, and can systematically compare the algorithmic performance of the message-passing
techniques on single, finite, randomly generated graphs to the average behavior in the thermodynamic limit.

It should also be noted that the vertex cover problem is closely related to a class of lattice glass models
[23, 24, 25]. In these models, hard particles are to be positioned on a lattice under geometrical packing constraints
representing hard-core interactions. These models are considered as simple lattice models for the glass transition due
to geometric frustration, and their closest packings correspond to minimal VCs [?].

This paper is organized as follows: the vertex cover problem is defined in Sec. II and the concept of the cavity graph is
defined in Sec. III. Sec. IV focuses on the warning propagation algorithm, and its performance and iterative stability
are analyzed; Sec. V focuses on the survey propagation algorithm; Sec. VI estimates the minimal vertex cover
size for a random graph using statistical-physics methods; finally, in Sec. VII we conclude this work.

II. THE MODEL

Let us start with the definition of vertex covers. Given is a graph $G = (V, E)$ with $N$ vertices $i = 1, \dots, N$ and $M$
undirected edges $\{i,j\} \in E$ connecting pairs of vertices.

Definitions: A vertex cover (VC) of the graph $G$ is a subset $U \subset V$ of vertices such that for all edges $\{i,j\} \in E$,
at least one end vertex is an element of $U$, i.e. $i \in U$ or $j \in U$. A minimal vertex cover of $G$ is a vertex cover of minimal
cardinality.

We also call vertices in $U$ covered, as well as their incident edges: The set $U$ is a VC iff all edges are covered.
Determining a minimal vertex cover is one of the basic NP-hard combinatorial problems [?]. Its worst-case solution
time is consequently expected to grow exponentially with the size of the problem instance, here measured by the vertex 
and edge numbers N and M. The problem is equivalent to the problem of constructing a maximum independent set of 
G, and to the problem of finding the maximum clique in the complementary graph of G (where edges and non-edges 
are exchanged). 
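
The definitions above can be made concrete on a very small graph. The following Python lines are only a minimal sketch: the function names and the toy instance are assumptions introduced for illustration, and the exhaustive search is exponential in the number of vertices, so it is not one of the algorithms discussed in this paper.

```python
from itertools import combinations

def is_vertex_cover(edges, cover):
    """True if every edge {i, j} has at least one end vertex in `cover`."""
    return all(i in cover or j in cover for i, j in edges)

def minimum_vertex_covers(vertices, edges):
    """All vertex covers of minimal cardinality, by exhaustive search.

    Exponential in the number of vertices; only meant for tiny graphs.
    """
    vertices = list(vertices)
    for size in range(len(vertices) + 1):
        covers = [set(c) for c in combinations(vertices, size)
                  if is_vertex_cover(edges, set(c))]
        if covers:
            return covers
    return []

# Hypothetical toy instance: the path 0 - 1 - 2 - 3.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3)]
covers = minimum_vertex_covers(vertices, edges)
# Complements of minimum vertex covers are maximum independent sets.
independent_sets = [set(vertices) - c for c in covers]
print(covers)
```

On this path, the complement of each minimum cover contains no edge, which illustrates the stated equivalence with the maximum independent set problem.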

The exponential running time of algorithms constructing minimal vertex covers is a serious limitation to practical 
applications: Exact algorithms are able to treat only relatively small sample graphs. It is therefore interesting to 
develop powerful heuristic methods which are able to construct at least close-to-minimal VCs, which may serve as 
reasonable solutions in practical problems. 

In the context of constraint-satisfaction problems (CSP), statistical-physics approaches have recently led to the
proposal of so-called survey propagation algorithms, which are sophisticated message-passing procedures based on the
cavity method of statistical physics. This type of algorithm was first proposed for the satisfiability problem [11]
and then extended to graph coloring [26] and general CSPs [27]; it is one of the most efficient algorithms in the
hard-to-solve phase of these problems.

The vertex cover problem is structurally different from CSPs. Whereas the computational problem of the latter 
results from the existence of a large number of constraints being hard to satisfy simultaneously, the constraints in 
vertex cover - i.e. the need of covering each edge of the graph - can in principle be satisfied very easily by covering 
many vertices. The computational hardness stems from the objective of finding a minimal vertex cover, i.e. from the 
interaction between a high number of local constraints on one side, and the global minimization condition on the 
other side. This leads to a difference in the validation of the output of a heuristic algorithm: Whereas a solution to 
a CSP can be easily checked by testing all constraints, and the problem consists in finding one, it is no problem at 
all to construct a VC, but its minimality can hardly be shown. One can say that the hardness of solving VC stems
from the fact that the landscape $X(U) = |U|$ becomes complex over the set of all VCs $U$ (note: not over the set of all
vertex subsets).

The algorithmic aim is therefore to construct a vertex cover as small as possible in polynomial time for some 
given graph G = (V,E). The central step in this context will be the calculation (or at least approximation) of the 
vertex-dependent number 

$$ \pi_i = \frac{|\{U \subset V \mid U \text{ is min. VC},\ i \in U\}|}{|\{U \subset V \mid U \text{ is min. VC}\}|} \tag{1} $$

which, for every vertex $i \in V$, equals the fraction of minimal vertex covers containing vertex $i$. In probabilistic
terms, it can be understood as the probability that i is covered in a randomly selected minimal vertex cover. 
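
For very small graphs, the numbers defined in Eq. (1) can be evaluated directly from their definition. The sketch below does this by exhaustive enumeration; the helper names and the two toy graphs are assumptions made for illustration only, and the approach does not scale beyond a handful of vertices.

```python
from fractions import Fraction
from itertools import combinations

def minimum_vertex_covers(vertices, edges):
    """All minimum vertex covers by exhaustive search (tiny graphs only)."""
    vertices = list(vertices)
    for size in range(len(vertices) + 1):
        covers = [set(c) for c in combinations(vertices, size)
                  if all(i in c or j in c for i, j in edges)]
        if covers:
            return covers
    return []

def cover_marginals(vertices, edges):
    """pi_i of Eq. (1): fraction of minimum vertex covers containing i."""
    covers = minimum_vertex_covers(vertices, edges)
    return {i: Fraction(sum(i in c for c in covers), len(covers))
            for i in vertices}

# Hypothetical examples: a path 0 - 1 - 2 - 3 and a star with center 0.
path = cover_marginals([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
star = cover_marginals([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
print(path)
print(star)
```

On the star, the center belongs to the only minimum cover, while on the path every vertex is covered in some minimum covers but not in others.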

Once we know these quantities, we can obviously exploit them algorithmically. We know, e.g., that each vertex
with $\pi_i = 1$ belongs to all minimal VCs, and it has to be included in the VC we are aiming to construct. Conversely,
vertices $i \in V$ with $\pi_i = 0$ do not appear in any minimal VC, and they have to be excluded from the vertex set
we are building. The problem is slightly more involved for those vertices having $\pi$-values different from zero and
one: They are contained in some minimal vertex covers, but not in others. Since $\pi_i$ gives only strictly local information,
we do not know any possible quantitative restriction on the simultaneous assignment of pairs or even larger subsets
of vertices. If we consider, e.g., one edge $\{i,j\} \in E$, the joint probability that both vertices are uncovered does not
equal $(1 - \pi_i)(1 - \pi_j)$, as one might naively assume by considering the vertices to be independent. It is obviously
zero due to the vertex-cover constraint for the edge: At least one of the end vertices has to be covered. This problem
can be resolved by an iterative decimation process. We select, e.g., a vertex of non-zero $\pi_i$ and add it to the VC $U$ to
be constructed, and delete the vertex as well as all its incident edges from the graph. We then recompute the $\pi_i$ on
the decimated graph, add a new vertex to $U$ and so on, until all edges of $G$ are covered: The vertex set $U$ now forms
a vertex cover of the graph $G$.

There is an obvious algorithmic problem with evaluating the $\pi_i$: A naive calculation according to their definition
would require prior knowledge of all minimal VCs, which we do not have if we are trying to develop an algorithm
finding just a single one of them. The way out will be a message-passing procedure [?] which only exchanges
local information between neighboring vertex pairs, until these messages reach globally self-consistent values. Such
message-passing procedures first need the introduction of the cavity graph, which will be done in the following section.

III. THE CAVITY GRAPH 

A simple idea could be to determine $\pi_i$ from all the $\pi_j$ of the neighbors $j \in N(i)$ of vertex $i$. This is not directly
possible: As discussed above, the $\pi_j$ are single-site quantities and do not contain any information about vertex pairs.
Any two $j \in N(i)$ are, however, second neighbors of each other via a path crossing $i$, and thus they are highly
correlated. Imagine, e.g., that vertex $i$ is not covered; then all $j \in N(i)$ have to be covered simultaneously. The
knowledge of the marginal cover probabilities $\pi_j$ is obviously not sufficient to determine the central $\pi_i$ as well. The way
out is to consider not the full graph, but the cavity graphs:

Definition: Given a graph $G = (V, E)$ and a vertex $i \in V$, the cavity graph $G_i$ is the subgraph of $G$ induced by
the vertex set $V_i = V \setminus \{i\}$.

In simpler words, the cavity graph is created from the full graph $G$ by removing vertex $i$ as well as its
incident edges $\{i,j\}$ for all $j \in N(i)$. On a tree graph, the $j \in N(i)$ would belong to pairwise distinct connected
components of the cavity graph, and they would be independent of each other. More generally, on a graph with
relatively long cycles, any two of the former neighbors of vertex $i$ will be distant on the cavity graph $G_i$. The basic
approximation underlying message-passing algorithms consists in assuming statistical independence of these vertices
on the cavity graph (within one thermodynamic state, as will be explained in the case of survey propagation).
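
The statement about trees can be checked on a small example. The following sketch builds the cavity graph of a hypothetical six-vertex tree and lists its connected components; the function names and the instance are assumptions chosen for illustration.

```python
def cavity_graph(vertices, edges, i):
    """Cavity graph G_i: remove vertex i together with its incident edges."""
    cav_vertices = [v for v in vertices if v != i]
    cav_edges = [(a, b) for a, b in edges if i not in (a, b)]
    return cav_vertices, cav_edges

def connected_components(vertices, edges):
    """Connected components by depth-first search, in order of first vertex."""
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adjacency[v] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical tree: vertex 0 has neighbors 1, 2, 3; edges 1-4 and 2-5 hang below.
vertices = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5)]
cav_vertices, cav_edges = cavity_graph(vertices, edges, 0)
components = connected_components(cav_vertices, cav_edges)
print(components)
```

Each former neighbor of vertex 0 ends up in its own component of the cavity graph, which is the situation in which the independence assumption holds exactly.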

Having defined the cavity graphs $G_i$ for each vertex $i$, we also define the generalized probabilities

$$ \pi_{j|i} = \frac{|\{U \subset V_i \mid U \text{ is min. VC of } G_i,\ j \in U\}|}{|\{U \subset V_i \mid U \text{ is min. VC of } G_i\}|} \tag{2} $$

measuring the fraction of minimal vertex covers of the cavity graph $G_i$ containing vertex $j \neq i$. Even if defined
formally for any pair of vertices $i$ and $j$, these quantities will be relevant in particular for those vertices connected by
an edge in the original graph $G$, i.e. for $\{i,j\} \in E$.

A comment on the statistical-independence assumption has to be included at this point: We are constructing an
algorithm for real, i.e. finite, graphs. This means that graph loops have finite length. The equations we are going
to present in the following will therefore be only approximations to the exact values of the probabilities $\pi_i$, and the
algorithm cannot guarantee to construct a true minimum vertex cover. So, even if the presented algorithm scales
only quadratically in the graph order $N$, it cannot be considered an exact polynomial algorithm, and therefore
does not contribute to the solution of the P-NP problem. The importance of message-passing algorithms is related
to practical applications on large graphs, where exact methods fail due to their exponential time requirements. As
we will see below in numerical simulations, the procedures presented here largely outperform purely local algorithms,
and therefore allow us to construct better approximations to the exact solution.

IV. WARNING PROPAGATION (WP) 
A. The algorithm 

The very first and simplest message-passing procedure we are going to introduce carries the name warning propagation
(WP). In this case, we are going to calculate only the reduced quantities

$$ \tilde\pi_i = \begin{cases} 0 & \text{if } \pi_i = 0 \\ * & \text{if } 0 < \pi_i < 1 \\ 1 & \text{if } \pi_i = 1 \end{cases} \tag{3} $$

and analogously the cavity quantities $\tilde\pi_{j|i}$. These quantities therefore do not measure the exact probability for a vertex to
be covered in a randomly selected minimal vertex cover. They only indicate whether it is always covered (value one),
never covered (value zero) or sometimes covered and sometimes uncovered. For this last case we have introduced
the unifying joker state $*$. Note that this information is also sufficient to be exploited algorithmically: If a vertex is
assigned the joker state, it can be chosen freely to be covered or to be uncovered during graph decimation.

As a first step, we introduce an even simpler message type, the so-called warning $u_{j\to i}$ sent from a vertex $j$ to a
neighbor $i$. This warning incorporates the vertex cover constraint: If the vertex $j$ is uncovered, it sends a warning
$u_{j\to i} = 1$ to vertex $i$ signifying: "Attention, to cover our connecting edge you should be covered, or I have to change
state." If, on the other hand, vertex $j$ is already covered, it sends the trivial message $u_{j\to i} = 0$ saying: "I have already
covered our connecting edge." More formally, a set of warnings is defined for every vertex subset $U \subset V$:

$$ u_{j\to i}(U) := \begin{cases} 0 & \text{if } j \in U \\ 1 & \text{if } j \notin U \end{cases} \tag{4} $$

with $\{i,j\} \in E$ being an arbitrary edge. Note that each edge carries two messages: One sent from $i$ to $j$, the other
one from $j$ to $i$. In a proper VC, at least one of the end vertices of each edge has to be covered, so we find that

$$ U \subset V \text{ is a VC of } G \iff \forall \{i,j\} \in E: \ u_{i\to j}(U) \cdot u_{j\to i}(U) = 0 , \tag{5} $$

i.e. each edge has to carry at least one trivial warning. The definition of the warning can also be extended to sets $\mathcal{M}$
of vertex subsets. We define

$$ u_{j\to i}(\mathcal{M}) := \min_{U \in \mathcal{M}} u_{j\to i}(U) , \tag{6} $$

i.e. a non-trivial message is sent if and only if vertex $j$ is an element of no $U \in \mathcal{M}$. This definition obviously reproduces
the warning (4) if $\mathcal{M}$ consists of only one vertex subset. The reason for selecting the minimum in the last definition
will become clear below. Using the set $S_i$ of all minimal vertex covers of the cavity graph $G_i$ as a special case, the
warning $u_{j\to i}(S_i)$ becomes a function of $\tilde\pi_{j|i}$ only. For an arbitrary but fixed edge we find

$$ u_{j\to i}(S_i) = u_{j\to i}(\tilde\pi_{j|i}) = \begin{cases} 1 & \text{if } \tilde\pi_{j|i} = 0 \\ 0 & \text{if } \tilde\pi_{j|i} = * \\ 0 & \text{if } \tilde\pi_{j|i} = 1 \end{cases} . \tag{7} $$

The required minimality of the vertex cover to be constructed leads to a simple propagation of these warnings,
or equivalently of the corresponding $\tilde\pi_{j|i}$. This can be achieved by considering how minimal vertex covers can be
extended from the cavity graph to the full graph. There are three cases, cf. Fig. 1:

(a) There exists at least one minimal vertex cover of the cavity graph $G_i$ where all $j \in N(i)$ (neighbors of $i$ in the
full graph $G$) are simultaneously covered. These VCs are also minimal VCs of the full graph $G$ since all edges
incident to $i$ are already covered, so $i$ has to be uncovered to guarantee minimality. The sizes of the minimal
VCs of $G_i$ and those of $G$ thus coincide. In this case we find $\tilde\pi_i = 0$, since there are no minimal VCs of $G$
containing $i$.

(b) All minimal vertex covers of $G_i$ leave at least two $j \in N(i)$ uncovered. Since all edges incident to vertex $i$ have
to be covered, we have to add $i$ to the VC of $G_i$ in order to extend it to the full graph. The VC of the full graph
thus contains exactly one vertex more than those of the cavity graph, and $\tilde\pi_i$ equals one.

(c) In the last, intermediate case, there is at least one minimal VC of $G_i$ containing all but one $j \in N(i)$, but there
is none containing all $j \in N(i)$. Also in this case, we have to add exactly one vertex when going from a VC of the
cavity graph $G_i$ to one of the full graph $G$, so the VC size grows by one. If, however, we use the VC leaving only
one $j \in N(i)$ uncovered, there exists only one single uncovered edge in $G$. To cover it, we can select any one of
its two end vertices, i.e. either $i$ or its neighbor. In this case, we therefore find $\tilde\pi_i = *$, i.e. vertex $i$ is found to
be in the joker state.

At this point, the independence assumption of all $j \in N(i)$ in the cavity graph enters the discussion: We consider
their joint probability of simultaneously being covered in a minimal VC of the cavity graph $G_i$, and assume this
quantity to factorize into $\prod_{j \in N(i)} \pi_{j|i}$. Under this assumption, case (a) happens if and only if all $\tilde\pi_{j|i} \neq 0$. Case (b)
happens if there are at least two vanishing $\tilde\pi_{j|i}$ among the $j \in N(i)$, and the third case appears for exactly one
zero $\tilde\pi_{j|i}$. We see that in this rule no difference between always covered and joker vertices $j \in N(i)$ exists, which
explains the use of the minimum warning in definition (6). We conclude:

$$ \tilde\pi_i = \begin{cases} 0 & \text{if } \sum_{j \in N(i)} u_{j\to i}(\tilde\pi_{j|i}) = 0 \\ * & \text{if } \sum_{j \in N(i)} u_{j\to i}(\tilde\pi_{j|i}) = 1 \\ 1 & \text{if } \sum_{j \in N(i)} u_{j\to i}(\tilde\pi_{j|i}) > 1 \end{cases} . \tag{8} $$

This rule is graphically represented in Fig. 1. The cavity quantities $\tilde\pi_{j|i}$ can now be calculated by considering the
cavity graphs $G_j$, and by disregarding in addition the influence of vertex $i$:

$$ \tilde\pi_{j|i} = \begin{cases} 0 & \text{if } \sum_{k \in N(j) \setminus i} u_{k\to j}(\tilde\pi_{k|j}) = 0 \\ * & \text{if } \sum_{k \in N(j) \setminus i} u_{k\to j}(\tilde\pi_{k|j}) = 1 \\ 1 & \text{if } \sum_{k \in N(j) \setminus i} u_{k\to j}(\tilde\pi_{k|j}) > 1 \end{cases} . \tag{9} $$

FIG. 1: Graphical representation of Eq. (8), with vertex $i$ being identified with the lower vertex in each sub-figure. The color
coding of the vertices corresponds to the values of $\tilde\pi_i$ and the $\tilde\pi_{j|i}$: Value zero is represented by a white dot, value one by a
black dot, and the joker value $*$ by a gray dot. In case (a), there are no white dots among the $j \in N(i)$, so the lower vertex
does not have to be covered and is colored white. If there is exactly one white dot in the upper line, the lower vertex becomes gray,
cf. (b). If there are two or more white dots in the upper line, as in (c), the lower vertex is black, corresponding to an always
covered vertex.

Eqs. (7)-(9) are called warning propagation. The last equation, together with Eq. (7), describes a closed system of $2|E|$
equations: Two for each edge due to the two different possible orientations of the messages. Note that Eqs. (7) and (9)
can also be reformulated for the warnings $u_{j\to i}$ themselves, eliminating the cavity quantities $\tilde\pi_{j|i}$. The iterative equations
take the particularly simple form

$$ u_{j\to i} = \prod_{k \in N(j) \setminus i} \delta(u_{k\to j}, 0) , \tag{10} $$

where, for better readability, we have used the notation $\delta(\cdot,\cdot)$ for the Kronecker symbol.

These equations have to be solved and plugged into Eq. (8) in order to calculate the values of all $\tilde\pi_i$. Even if this
information is not yet sufficient to immediately solve the minimal VC problem, we can already read off a lot of useful
information about the properties of all minimal vertex covers. The most important quantity is an estimate for the
minimal VC cardinality:

$$ X \simeq \sum_{i \in V} \left[ \delta(\tilde\pi_i, 1) + \frac{1}{2}\, \delta(\tilde\pi_i, *) \right] . \tag{11} $$

The prefactor 1/2 in front of the number of joker vertices is not a direct result of WP. It can be justified using the
more detailed belief propagation calculating the full single-site probability $\pi_i$ [8], or via the replica method [16].
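
The warning update, the three-valued marginals and the size estimate translate almost literally into code. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the warnings are initialized to zero for reproducibility (rather than randomly), and the sweep limit is an arbitrary choice.

```python
def build_neighbors(vertices, edges):
    """Adjacency sets N(i) for an undirected graph."""
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def warning_propagation(neighbors, max_sweeps=100):
    """Sequentially iterate the warning update until no warning changes.

    u[(j, i)] is the warning sent from j to i: it equals 1 iff all warnings
    u[(k, j)] arriving at j from k != i are 0. Returns (u, converged).
    """
    u = {(j, i): 0 for j in neighbors for i in neighbors[j]}  # assumed init
    for _ in range(max_sweeps):
        changed = False
        for (j, i) in u:
            new = int(all(u[(k, j)] == 0 for k in neighbors[j] if k != i))
            if new != u[(j, i)]:
                u[(j, i)] = new
                changed = True
        if not changed:
            return u, True
    return u, False

def wp_marginals(neighbors, u):
    """Three-valued marginals: 0, '*' (joker) or 1, from the warning sums."""
    result = {}
    for i in neighbors:
        total = sum(u[(j, i)] for j in neighbors[i])
        result[i] = 0 if total == 0 else ("*" if total == 1 else 1)
    return result

def cover_size_estimate(marginals):
    """Always-covered vertices count 1, joker vertices count 1/2."""
    values = list(marginals.values())
    return values.count(1) + 0.5 * values.count("*")

# Hypothetical test graphs: a path 0 - 1 - 2 - 3 and a star with center 0.
path = build_neighbors([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
u_path, path_converged = warning_propagation(path)
path_marginals = wp_marginals(path, u_path)

star = build_neighbors([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
u_star, star_converged = warning_propagation(star)
star_marginals = wp_marginals(star, u_star)
```

On these two trees the result agrees with exhaustive enumeration: all four path vertices are jokers (estimated size 2), while on the star only the center is always covered (estimated size 1).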

The WP equations can be used to construct a vertex cover, i.e. they can be exploited algorithmically. This is done
in the following way, starting with an initial graph $G = (V,E)$ and an empty set $U = \emptyset$:

1. The $2|E|$ warnings $u_{i\to j}$ are initialized randomly.

2. Then, sequentially, edges are selected and the warnings are updated using Eq. (10). This update is iterated
until a solution of the warning-propagation equations is found.

3. The $\tilde\pi_i$ are calculated from the warnings using Eq. (8).

4. All vertices with $\tilde\pi_i = 1$ are added to $U$, and deleted together with their incident edges from $G$.

5. All vertices with $\tilde\pi_i = 0$ are deleted from $V$, without changing $U$. Since a vertex with $\tilde\pi_i = 0$ has only neighbors
with $\tilde\pi_j = 1$, it was already isolated after the last step. Therefore, no edges have to be removed from $E$.

6. One remaining vertex $i$ (with $\tilde\pi_i = *$) is selected, and all its neighbors $j \in N(i)$ are added to $U$. Vertex $i$ and the
vertices in $N(i)$ are removed from $V$, and all their incident edges are removed from $E$.

7. If uncovered edges are left, we go back to step 2, and recalculate the warnings on the decimated graph. If no
edges are left, the current $U$ is returned as a solution.
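
The seven steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names, the sweep limit and the random seed are assumptions, and the fallback in step 6 (taking any vertex that still has edges if no joker is available) is an addition made here so that the loop terminates even when the warnings are unreliable, e.g. on graphs with short loops.

```python
import random

def wp_fixed_point(neighbors, rng, max_sweeps=1000):
    """Steps 1-2: random initial warnings, then sequential updates.

    The sweep limit is an assumption of this sketch; on graphs with loops
    the iteration may stop without having converged.
    """
    u = {(j, i): rng.randint(0, 1) for j in neighbors for i in neighbors[j]}
    for _ in range(max_sweeps):
        changed = False
        for (j, i) in u:
            new = int(all(u[(k, j)] == 0 for k in neighbors[j] if k != i))
            if new != u[(j, i)]:
                u[(j, i)] = new
                changed = True
        if not changed:
            break
    return u

def wp_decimation(vertices, edges, seed=0):
    """Construct a vertex cover by WP-guided decimation (steps 1-7)."""
    rng = random.Random(seed)
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    cover = set()

    def remove(v):
        # Delete v and all its incident edges from the current graph.
        for w in neighbors.pop(v):
            neighbors[w].discard(v)

    while any(neighbors.values()):            # step 7: edges are left
        u = wp_fixed_point(neighbors, rng)    # steps 1-2
        marginal = {}                         # step 3
        for i in neighbors:
            total = sum(u[(j, i)] for j in neighbors[i])
            marginal[i] = 0 if total == 0 else ("*" if total == 1 else 1)
        for i in [v for v in marginal if marginal[v] == 1]:   # step 4
            cover.add(i)
            remove(i)
        for i in [v for v in neighbors
                  if marginal[v] == 0 and not neighbors[v]]:  # step 5
            remove(i)
        # Step 6: pick a joker vertex that still has edges. Falling back to
        # any vertex with edges is an assumption added here so that the loop
        # always makes progress, even if the warnings were unreliable.
        candidates = [v for v in neighbors if neighbors[v]]
        jokers = [v for v in candidates if marginal[v] == "*"]
        if candidates:
            i = (jokers or candidates)[0]
            for j in list(neighbors[i]):
                cover.add(j)
                remove(j)
            remove(i)
    return cover

# Hypothetical instances: a path, a star, and a triangle (a short loop).
path_edges = [(0, 1), (1, 2), (2, 3)]
path_cover = wp_decimation([0, 1, 2, 3], path_edges)
star_cover = wp_decimation([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
triangle_edges = [(0, 1), (1, 2), (0, 2)]
triangle_cover = wp_decimation([0, 1, 2], triangle_edges)
```

Because only covered edges are ever removed, the returned set is a vertex cover in all cases; its minimality is only expected when the warnings are exact, as on the two trees used here.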

Obviously, the constructed $U$ forms a vertex cover, since only covered edges are removed from the graph. It is also
a minimal one, if the information provided by the $\tilde\pi_i$ was correct. Due to the factorization hypothesis in WP, some
of the $\tilde\pi_i$ may, however, be erroneous, possibly resulting in a non-minimal cover. It is worth noting that after each
graph decimation step followed by a re-iteration of the WP equations, a new estimate of the VC size can be calculated
according to Eq. (11). This estimate is expected to be stationary only in the case where the initial warnings
were already exact, and to change under the algorithm if the latter were only approximations.



B. From single samples to average results on random graphs 

Starting from Eqs. (10) and (11), we can easily reconstruct the replica-symmetric typical-case results for random
graphs of average vertex degree $c$. We start with defining the global histogram of warnings,

$$ Q(u) = \frac{1}{2|E|} \sum_{\{i,j\} \in E} \left[ \delta(u_{i\to j}, u) + \delta(u_{j\to i}, u) \right] . \tag{12} $$

Due to the binary nature of the warnings, it can be parametrized as

$$ Q(u) = p_0\, \delta(u,0) + p_1\, \delta(u,1) \tag{13} $$

with $p_0 + p_1 = 1$. Consider now Eq. (10): A non-trivial warning is sent via a link $j \to i$ only if the input messages
$u_{k\to j}$ from all $k \in N(j) \setminus i$ equal zero. This happens for all warnings independently with probability $p_0$, and the
number $d$ of these incoming messages is, on a random graph, distributed according to a Poissonian of mean $c$. We
thus find

$$ p_1 = \sum_{d=0}^{\infty} e^{-c} \frac{c^d}{d!}\, p_0^d = e^{-c p_1} , \tag{14} $$

which, using the Lambert $W$ function, is solved by

$$ p_1 = \frac{W(c)}{c} . \tag{15} $$
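
The closed form can be checked numerically against a direct iteration of the self-consistency condition $p_1 = e^{-c p_1}$. The sketch below uses only the standard library; the Newton iteration for the Lambert $W$ function, the starting points and the iteration counts are assumptions of this illustration.

```python
import math

def lambert_w(c, tol=1e-14):
    """Principal branch W(c) for c >= 0, via Newton iteration on w*exp(w) = c."""
    w = math.log1p(c)  # starting point to the right of the root for c >= 0
    for _ in range(100):
        step = (w * math.exp(w) - c) / (math.exp(w) * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def p1_by_iteration(c, iterations=2000):
    """Iterate p1 <- exp(-c * p1) starting from an arbitrary value."""
    p1 = 0.5
    for _ in range(iterations):
        p1 = math.exp(-c * p1)
    return p1

c = 2.0
p1_closed_form = lambert_w(c) / c
p1_iterated = p1_by_iteration(c)
print(p1_closed_form, p1_iterated)
```

The slope of the map $p_1 \mapsto e^{-c p_1}$ at its fixed point has magnitude $c p_1$, so this simple iteration is only expected to converge for $c < e$.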

Let us now also reconstruct the histogram

$$ P(\tilde\pi) = \frac{1}{N} \sum_{i \in V} \delta(\tilde\pi_i, \tilde\pi) \tag{16} $$

of single-site marginals $\tilde\pi_i$. The latter are three-valued; we thus parametrize the histogram as

$$ P(\tilde\pi) = \nu_0\, \delta(\tilde\pi, 0) + \nu_*\, \delta(\tilde\pi, *) + \nu_1\, \delta(\tilde\pi, 1) . \tag{17} $$

Using Eq. (8) and the Poissonian degree distribution, we find

$$ \begin{aligned} \nu_0 &= \sum_{d=0}^{\infty} e^{-c} \frac{c^d}{d!}\, p_0^d = p_1 , \\ \nu_* &= \sum_{d=0}^{\infty} e^{-c} \frac{c^d}{d!}\, d\, p_1\, p_0^{d-1} = c\, p_1^2 , \\ \nu_1 &= 1 - \nu_* - \nu_0 . \end{aligned} \tag{18} $$

For the derivation of the second expression we have used that the single non-zero warning among the messages reaching
a joker vertex can be chosen freely among all $d$ incoming edges. For the VC size we thus find

$$ x(c) = \lim_{N \to \infty} \frac{X}{N} = 1 - \frac{W(c)}{c} - \frac{W(c)^2}{2c} , \tag{19} $$

which is identical to the result of a replica-symmetric calculation [17]. For average degrees $c < e$, this result was
shown to be exact [18], and it can in fact already be read off from an older result by Karp and Sipser [29] on maximal
matchings in random graphs; see also [23, ?] for related statistical-physics approaches.
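
As a numerical cross-check of Eqs. (15), (18) and (19), the following sketch evaluates the three histogram weights and the resulting cover fraction for a given average degree. The function names and the Newton solver for $W$ are assumptions of this illustration.

```python
import math

def lambert_w(c, tol=1e-14):
    """Principal branch W(c) for c >= 0, via Newton iteration on w*exp(w) = c."""
    w = math.log1p(c)
    for _ in range(100):
        step = (w * math.exp(w) - c) / (math.exp(w) * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def rs_cover_statistics(c):
    """Return (nu_0, nu_*, nu_1, x) for average degree c > 0."""
    p1 = lambert_w(c) / c            # Eq. (15)
    nu0 = p1                         # never covered
    nu_star = c * p1 ** 2            # joker vertices
    nu1 = 1.0 - nu0 - nu_star        # always covered
    x = nu1 + 0.5 * nu_star          # jokers enter with weight 1/2
    return nu0, nu_star, nu1, x

nu0, nu_star, nu1, x = rs_cover_statistics(1.0)
print(nu0, nu_star, nu1, x)
```

Adding the always-covered fraction and half of the joker fraction reproduces the closed form $1 - W(c)/c - W(c)^2/(2c)$.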



C. Bug proliferation and the stability of the WP fixed point 



Besides the problem that the solution of the warning-propagation equations may be imprecise due to the existence
of short loops in the graph, there can be another problem: the iteration of the warning update may fail to converge.
This can again happen due to the existence of short loops, which may lead to attractive limit cycles in the iterative
warning dynamics. Another problem can appear due to the existence of many solutions of Eqs. (10). In statistical
physics we say that the replica symmetry is broken.

To be more quantitative, we study here the stability of a WP solution with respect to the introduction of a bug
[32]: One of the warnings $u_{j\to i}$ is changed to its opposite value. After one iteration of WP, the bug itself will be
cured since it depends only on unchanged messages. On the other hand, the warnings from vertex $i$ to its neighbors
$k \in N(i) \setminus j$ may be changed, i.e. new bugs may appear. The question is now whether these bugs proliferate and, after some
iterations, change a finite fraction of all warnings, or whether the bugs die out after a while. Only in the second case is WP
stable, and only then can it be usefully included in a decimation procedure.

Here, we perform this analysis analytically for the case of a random graph of average degree $c$. In this case,
the number $d$ of neighbors $k \in N(i) \setminus j$ receiving messages depending on the bug is distributed according to the
Poissonian $e^{-c} c^d / d!$. They themselves send warnings to vertex $i$ which are, due to the locally tree-like structure
of a random graph, independent of $u_{j\to i}$, and can be considered to be randomly selected according to the global
histogram $Q(u) = p_0\, \delta(u,0) + p_1\, \delta(u,1)$ of warnings introduced in Eq. (12).

We have to distinguish two cases for the introduction of a bug: 

(i) We change the message $u_{j\to i}$ from one to zero.

Prior to this change, all out-messages $u_{i\to k}$ with $k \in N(i) \setminus j$ were equal to zero, cf. Eq. (10). Let us denote by
$d = |N(i) \setminus j|$ the number of out-messages depending on $u_{j\to i}$, i.e., the degree of vertex $i$ equals $d + 1$.

After the introduction of the bug, an out-message $u_{i\to k}$ becomes one if and only if all other in-messages $u_{l\to i}$
with $l \in N(i) \setminus \{j, k\}$ are zero. There are two sub-cases. First, with probability $p_0^d$, all messages $u_{k\to i}$ equal zero.
In this case, all $d$ out-messages change. Second, with probability $d\, p_0^{d-1} (1 - p_0)$, exactly one message $u_{k\to i}$ has
value one, and all others are zero. In this case, only $u_{i\to k}$ changes under iteration. On average, the bug thus introduces

$$ \sum_{d} e^{-c} \frac{c^d}{d!} \left[ d\, p_0^d + d\, p_0^{d-1} (1 - p_0) \right] = c\, e^{-c(1 - p_0)} = c\, e^{-c p_1} $$

new bugs into the graph. These bugs are of the second type.

(ii) We change the message $u_{j\to i}$ from zero to one.

After introduction of the bug, all out-messages $u_{i\to k}$ with $k \in N(i) \setminus j$ become zero under WP update. They
are bugs only if, in the initial WP solution, they had the value one. Using arguments analogous to the first
case, we find that, with probability $p_0^d$, all $d$ out-messages were one, and, with probability $d\, p_0^{d-1} (1 - p_0)$, only
one single message was one and becomes a new bug. The expected number of new bugs caused by the changed
$u_{j\to i}$ again equals $c\, e^{-c p_1}$.

We now apply a simple percolation-type argument: If the average number of new bugs is smaller than one, the bugs
are expected to be cured after a few iterations, and the WP solution is stable under bug introduction. If, on the other
hand, the average number of new bugs is larger than one, we expect an exponential increase in the bug number. Bugs
proliferate and carry the system away from the WP fixed point. The latter is thus concluded to be unstable. Note
that this argument holds only because we update out-messages which, under iterated WP updates, do not interact
because they all influence disjoint sets of further warnings.

The critical point can now be determined easily: The average number of new bugs is set to one, $c_{WP}\, e^{-c_{WP} p_1} = 1$.
Comparing it to the self-consistent Eq. (15), we immediately conclude that

$$ c_{WP} = \frac{1}{p_1(c_{WP})} = e . \tag{20} $$

WP converges below this critical connectivity, i.e. in the full region where replica symmetry is exact. As one would
expect intuitively, it does not converge in the replica-symmetry-broken phase above average degree $e$; there, survey
propagation, as discussed in the following section of this work, has to be applied.
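
The location of the critical point can be verified numerically by evaluating the mean number of new bugs, $c\, e^{-c p_1}$, on both sides of $c = e$. In the sketch below, the bisection solver for $p_1$ and the function names are assumptions made for illustration.

```python
import math

def p1_of_c(c, iterations=200):
    """Solve p1 = exp(-c * p1) on [0, 1] by bisection (for c > 0)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - math.exp(-c * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mean_new_bugs(c):
    """Average number of new bugs caused by a single bug: c * exp(-c * p1)."""
    return c * math.exp(-c * p1_of_c(c))

below, critical, above = (mean_new_bugs(c) for c in (2.0, math.e, 3.0))
print(below, critical, above)
```

The branching ratio is below one for $c = 2$, equals one at $c = e$, and exceeds one for $c = 3$, in line with the percolation-type argument.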



D. Bug relaxation time of WP 

Even if WP provides asymptotically exact results in the full replica symmetric phase in a running time scaling 
quadratically with N, its convergence slows down if we approach the critical average degree. This can be seen 
analytically by calculating the evolution of the number of erroneous messages, or bugs, under various update schemes. 



1. Parallel update 



Let us start with a parallel update scheme, where, in every iteration step, all messages are recalculated simultaneously
from the old messages. Assume that there are $M_1 \ll 2M = 2|E|$ erroneous messages. These are, up to
higher-order effects, isolated from each other and thus act independently under WP iteration. Each of these bugs
is thus corrected in a new WP step, but causes, as seen in the last subsection, on average $c\, e^{-c p_1} = c\, p_1$ new
wrong messages. Again, up to higher-order corrections, these messages do not interact. For the expected number
$M_1(t)$ we thus find

$$ M_1(t) = (c\, p_1)^t\, M_1(0) , \tag{21} $$

i.e., below $c_{WP} = e$, this number decays exponentially with a time scale

$$ \tau_{par} = - \frac{1}{\ln(c\, p_1)} . \tag{22} $$

This relaxation time diverges if we approach $c = e$ from below. To unveil the critical behavior, we set $c = e - \varepsilon$
($0 < \varepsilon \ll 1$). With $p_1 = 1/e + \delta$ we find, using Eq. (15),

$$ \frac{1}{e} + \delta = \exp\left\{ -1 + \frac{\varepsilon}{e} - e\,\delta + O(\varepsilon\delta) \right\} , \tag{23} $$

i.e. $\delta \simeq \varepsilon / (2e^2)$, resulting in $c\, p_1 = 1 - \varepsilon/(2e) + O(\varepsilon^2)$, and thus in

$$ \tau_{par} \simeq \frac{2e}{e - c} \quad \text{for } 0 < e - c \ll 1 . \tag{24} $$

The critical exponent one is expected to result from the mean-field structure of the underlying graph. 
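
The divergence of the parallel relaxation time can be made explicit by comparing Eq. (22) with its asymptotic form (24) close to $c = e$. The sketch below is illustrative only; the bisection solver for $p_1$, the function names and the chosen distance from the critical point are assumptions.

```python
import math

def p1_of_c(c, iterations=200):
    """Solve p1 = exp(-c * p1) on [0, 1] by bisection (for c > 0)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - math.exp(-c * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tau_parallel(c):
    """Relaxation time of Eq. (22), defined for 0 < c < e."""
    return -1.0 / math.log(c * p1_of_c(c))

def tau_parallel_asymptotic(c):
    """Leading critical behavior of Eq. (24), valid for 0 < e - c << 1."""
    return 2.0 * math.e / (math.e - c)

c_near = math.e - 0.01
ratio = tau_parallel(c_near) / tau_parallel_asymptotic(c_near)
print(tau_parallel(2.0), tau_parallel(c_near), ratio)
```

At a distance of $0.01$ from the critical degree, the two expressions differ by well below one percent, while further away from $c = e$ the relaxation time is of order a few sweeps.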
2. Random update



The situation is slightly more involved in the case of a random update, where in every time step one message is
selected randomly out of all $2M$ warnings, and is updated according to the WP equation. Let us denote by $p_T(M_1)$
the probability that there are $M_1$ erroneous messages after $T$ random update steps. Its evolution under WP is given
by the rate equation

$$ p_{T+1}(M_1) = p_T(M_1) - \frac{M_1}{2M}\, p_T(M_1) + \frac{M_1 + 1}{2M}\, p_T(M_1 + 1) - \frac{c\, p_1 M_1}{2M}\, p_T(M_1) + \frac{c\, p_1 (M_1 - 1)}{2M}\, p_T(M_1 - 1) . \tag{25} $$

This is due to the fact that, with probability $M_1/(2M)$, we pick a bug and correct it, changing $M_1 \mapsto M_1 - 1$,
and with probability $c\, p_1 M_1/(2M)$ we pick a "child" of a bug which becomes changed (remember that a bug has on
average $c$ child messages, but only a fraction $p_1$ of these is changed when updated under WP), changing
$M_1 \mapsto M_1 + 1$.

After $2M$ random updates, each message is, on average, visited once. To obtain time scales comparable to the
parallel update, we therefore have to rescale time as $t = T/(2M)$, identifying a single random update with the
asymptotically infinitesimal time step $dt = 1/(2M)$. In this limit, Eqs. (25) can be rewritten as a system of ordinary
differential equations,

$$ \frac{d}{dt}\, p_t(M_1) = -M_1\, p_t(M_1) + (M_1 + 1)\, p_t(M_1 + 1) - c\, p_1 M_1\, p_t(M_1) + c\, p_1 (M_1 - 1)\, p_t(M_1 - 1) . \tag{26} $$



For the time evolution of the average number $\overline{M_1}(t) = \sum_{M_1} M_1\, p_t(M_1)$ of bugs we thus find

$$ \begin{aligned} \frac{d}{dt}\, \overline{M_1}(t) &= -\overline{M_1^2} + \overline{M_1 (M_1 - 1)} - c\, p_1\, \overline{M_1^2} + c\, p_1\, \overline{M_1 (M_1 + 1)} \\ &= -(1 - c\, p_1)\, \overline{M_1}(t) . \end{aligned} \tag{27} $$

It decays exponentially with

$$ \tau_{rand} = \frac{1}{1 - c\, p_1} \tag{28} $$

and thus shows the same critical behavior as the parallel update. The main difference appears for $c \to 0$: Whereas
the parallel relaxation time goes to zero, $\tau_{rand}$ approaches one. This reflects the persistence time during which a message is
not updated at all: The fraction of variables which are not selected in $N$ single-spin updates is $e^{-1}$.
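
The passage from single random updates to the exponential decay of Eq. (27) can be checked numerically at the level of the mean bug number. In the sketch below, `num_messages` plays the role of $2M$; this name, the bisection solver and all parameter values are assumptions chosen for illustration.

```python
import math

def p1_of_c(c, iterations=200):
    """Solve p1 = exp(-c * p1) on [0, 1] by bisection (for c > 0)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - math.exp(-c * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tau_random(c):
    """Relaxation time of Eq. (28), in units of 2M single updates."""
    return 1.0 / (1.0 - c * p1_of_c(c))

def mean_bugs_random_update(m0, c, num_messages, sweeps):
    """Expected bug number after sweeps * num_messages single updates.

    Each update removes a bug with probability M1 / num_messages and creates
    one with probability c * p1 * M1 / num_messages (for M1 << num_messages).
    """
    rate = 1.0 - c * p1_of_c(c)
    mean = float(m0)
    for _ in range(int(sweeps * num_messages)):
        mean -= rate * mean / num_messages
    return mean

c, m0, sweeps = 1.0, 10, 3
simulated = mean_bugs_random_update(m0, c, num_messages=20000, sweeps=sweeps)
predicted = m0 * math.exp(-sweeps / tau_random(c))
print(simulated, predicted, tau_random(1e-6))
```

For a large number of messages the discrete recursion agrees with the exponential law, and for very small average degree the relaxation time approaches one.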

Note that the algorithm as presented in Sec. IV A uses a third update scheme, namely sequential update, which is
asynchronous but sees every message exactly once in $2M$ steps. The analytical description is more involved than that
of a simple parallel or random update, but the critical behavior is expected to remain unchanged. For small $c$,
the behavior is furthermore expected to be more similar to that of the parallel update scheme: Since every message
is seen exactly once in $2M$ steps, there are no persistence effects.



E. Numerical tests 



We have performed numerical tests of WP on randomly generated instances of random graphs at various connectivities 
$c < e$, for graph sizes up to $N = 10^5$. 

To verify the results, we have also applied the leaf-removal algorithm, which was shown [18] to output exact results 
in exactly the same connectivity region, and which is the basis of the proof of correctness of the replica-symmetric 
result. The algorithm works as follows: In every step, a leaf (vertex of degree one) is selected, its neighbor is covered, 
and both vertices are removed from the graph, as well as all their incident edges. If this algorithm is able to cover 
the full graph, the generated VC is a minimum one, but the algorithm fails if, possibly after some decimation steps, 
a leaf-free subgraph emerges. 
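
The leaf-removal procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used for the experiments: the function name `leaf_removal`, the edge-list input format and the returned pair (cover, number of edges left in the leaf-free core) are assumptions made here.

```python
def leaf_removal(n, edges):
    """Leaf removal on a graph with vertices 0..n-1.

    Returns (cover, core_edges): the set of covered vertices and the number
    of edges of the leaf-free subgraph that is left over (0 means success).
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    cover = set()
    leaves = [v for v in adj if len(adj[v]) == 1]
    while leaves:
        leaf = leaves.pop()
        if len(adj[leaf]) != 1:
            continue  # stale entry: the vertex is no longer a leaf
        (neighbor,) = adj[leaf]
        cover.add(neighbor)
        # Remove the covered neighbor together with all its incident edges;
        # the leaf itself becomes isolated and thereby drops out as well.
        for w in list(adj[neighbor]):
            adj[w].discard(neighbor)
            if len(adj[w]) == 1:
                leaves.append(w)
        adj[neighbor].clear()

    core_edges = sum(len(nbs) for nbs in adj.values()) // 2
    return cover, core_edges


if __name__ == "__main__":
    print(leaf_removal(4, [(0, 1), (1, 2), (2, 3)]))   # path: succeeds
    print(leaf_removal(3, [(0, 1), (1, 2), (0, 2)]))   # triangle: no leaf
```

On a tree the routine always terminates with an empty core; on a triangle it cannot start, which is the failure mode mentioned in the text.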

We have found that both algorithms produce identical results in almost all cases, i.e. the WP output is thereby 
shown to be exact. Also the initial estimate of the VC size after the first convergence of WP, before starting graph 
decimation, was found to coincide with the final output in the vast majority of cases. As discussed above, this is 
a signal that already the first convergence of WP leads to exact messages. 

The problem of WP is, as discussed before, the slowing down and final non-convergence if we approach (or exceed) 
an average degree $c = e$. In Fig. 2 we have quantified this phenomenon. We have measured the fraction of graphs of 
given average degree c (and given N) which, within 1000 sequential WP updates of all 2M messages, are converged 
on more than 99% of all messages. In the figure, we see a clear drop of this probability from almost one to zero in 
a region concentrated close to $c = e$. This drop sharpens considerably with growing graph size N, and thus suggests 
the existence of a sharp transition in the WP behavior in the thermodynamic limit $N \to \infty$. Note that, in Fig. 2, this 
transition seems to be at a graph degree slightly larger than $c = e$. This is a result of the measured quantity: 
The transition should be found exactly at $c = e$ when, for an arbitrarily large but finite number of updates, almost all 
messages are converged - instead of the test values used in the generation of Fig. 2. 

FIG. 2: Convergence probability of WP as a function of the average degree, for graphs of N = 250, 1000 and 4000 vertices. 
The symbols signify the fraction of graphs where, after 1000 sequential updates, at least 99% of the warnings are converged, 
measured for 10000, 3000 and 1000 sample graphs, respectively. The dashed vertical line is situated at $c = e$, where WP theoretically 
ceases to converge. 



V. SURVEY PROPAGATION (SP) 

We have already mentioned the possibility that the equations of warning propagation possess a high number of 
solutions, and none can be found using a local iterative update scheme. The messages would try to converge to different, 
conflicting solutions in different regions of the graph, and global convergence cannot be achieved. In physics' language, 
these different solutions correspond to different thermodynamic states - to be understood as clusters of minimal VCs. 
Inside such a cluster, any two VCs are connected by at least one path via other (almost) minimal VCs, which differ 
stepwise only by a small number of elements (the number of these different elements stays finite in the thermodynamic 
limit). For two minimal VCs selected from two different clusters, no such connecting path exists: at least one 
extensive step has to be performed. Note that this distinction is, from a mathematical point of view, not well-defined 
for finite graphs - which are the objects of our algorithms. There can be, however, a clear separation of distance scales 
which practically allows for an identification of solution clusters. 

As already said, warning propagation works well only if there is a single cluster (or a very small number of clusters) 
- corresponding to the replica symmetric solution. A breaking of the replica symmetry implies the emergence of 
clustering in the solution space. This effect is taken into account by the survey propagation (SP) algorithm, as first 
proposed in [ill l3^ | . This algorithm is equivalent to the first step of replica symmetry breaking, where the solution 
clusters show no further organization. If there are clusters of clusters etc., one has to go beyond survey propagation. 

FIG. 3: Schematic graphical representation of the organization of optimal solutions for warning propagation (left side) and for 
survey propagation (right side). For the first case, all solutions are collected in one large unstructured cluster (or in a very 
small number of these clusters, as in the case of a ferromagnet), corresponding to unbroken replica symmetry. In the second 
case, the set of solutions is clustered into a large number of extensively separated subsets. Survey propagation corresponds to 
one step of replica symmetry breaking, where there is no further organization of the clusters in larger clusters. 



A. The algorithm 



Let us, however, assume a clustering only on one level. Instead of defining probabilities like $\pi_i$ over the full solution 
space, we consider for a moment only one cluster. Inside such a cluster of minimum VCs, a vertex i may either be 
always covered (state 1), never covered (state 0) or sometimes covered and sometimes not (joker state *). This means 
that, for single clusters, we treat the problem on the same level as WP. 

However, the assignment of this three-valued vertex state may vary from cluster to cluster. We now denote by $\pi_i^{(1)}$ 
the fraction of clusters where vertex i takes state one, by $\pi_i^{(0)}$ the fraction of clusters where vertex i takes state zero, 
and by $\pi_i^{(*)}$ the fraction of clusters where vertex i takes the joker state *. Analogously we define the cavity quantities 
$\pi_{j|i}^{(1)}$, $\pi_{j|i}^{(0)}$ and $\pi_{j|i}^{(*)}$ on the cavity graph $G_i$. A crucial assumption of SP is that clusters do not change dramatically by 
eliminating one vertex from the graph, i.e., by going back and forth between the full graph and the cavity graphs for 
different cavities. 

Again, we can distinguish the three cases in Fig. 1 of how the variable states propagate inside each solution cluster. 
A vertex i of state 0 has to have all neighbors in states 1 or * on the cavity graph $G_i$; a vertex i of state * has to have 
exactly one neighbor of state 0 on the cavity graph; a vertex i of state 1 has at least two neighbors which have state 
0 on the cavity graph. The statistics over all clusters can now be performed in a very simple way. The fraction of 
clusters having vertex i in state 0, which by definition is $\pi_i^{(0)}$, equals the fraction of solution clusters of the cavity graph 
$G_i$ where all neighbors are in a state different from 0, and so on for the other two states. This procedure guarantees 
the minimization inside each cluster. Note, however, that in clusters belonging to the first case no vertex has to be 
added to the minimal VC by stepping from the cavity graph to the full graph, whereas the VC size increases by one 
in the second and third case. The VCs of different clusters thus grow differently. To optimize between clusters, we 
therefore introduce a penalty $e^{-y}$ to the last two cases. The resulting equations are 



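$$ \begin{aligned}
\pi_i^{(0)} &= C_i^{-1} \prod_{j\in N(i)} \left(1 - \pi_{j|i}^{(0)}\right) , \\
\pi_i^{(*)} &= C_i^{-1}\, e^{-y} \sum_{j\in N(i)} \pi_{j|i}^{(0)} \prod_{j'\in N(i)\setminus j} \left(1 - \pi_{j'|i}^{(0)}\right) , \\
\pi_i^{(1)} &= C_i^{-1}\, e^{-y} \left[ 1 - \prod_{j\in N(i)} \left(1 - \pi_{j|i}^{(0)}\right) - \sum_{j\in N(i)} \pi_{j|i}^{(0)} \prod_{j'\in N(i)\setminus j} \left(1 - \pi_{j'|i}^{(0)}\right) \right] ,
\end{aligned} \tag{29} $$

and the normalization constant is given by 

$$ C_i = e^{-y} + \left(1 - e^{-y}\right) \prod_{j\in N(i)} \left(1 - \pi_{j|i}^{(0)}\right) . \tag{30} $$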



Note that we have again made an assumption of statistical independence of the vertices j on the cavity graph. This 
assumption enters on two levels: First inside the cluster, when we say that j vertices of state * can be covered 
simultaneously in a minimum VC of the cavity graph; and second in between clusters, when we factorize the joint 
probabilities in the upper expression. 
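
As a sketch of how the three cluster fractions of a single vertex follow from the incoming cavity quantities, the function below evaluates the vertex equations together with their normalization. The name `sp_vertex_marginals` and the plain-list input are illustrative choices made here, not part of the original algorithm description.

```python
import math


def sp_vertex_marginals(pi0_in, y):
    """Return (pi0, pi_star, pi1) of a vertex.

    pi0_in : list of the cavity quantities pi^(0)_{j|i} of all neighbors j
    y      : re-weighting parameter (penalty e^{-y} per added cover vertex)
    """
    w = math.exp(-y)
    # Probability weight that no neighbor is in state 0.
    none_zero = math.prod(1.0 - p for p in pi0_in)
    # Probability weight that exactly one neighbor is in state 0.
    one_zero = sum(
        p * math.prod(1.0 - q for m, q in enumerate(pi0_in) if m != k)
        for k, p in enumerate(pi0_in)
    )
    norm = w + (1.0 - w) * none_zero
    pi0 = none_zero / norm
    pi_star = w * one_zero / norm
    pi1 = w * (1.0 - none_zero - one_zero) / norm
    return pi0, pi_star, pi1


if __name__ == "__main__":
    print(sp_vertex_marginals([0.2, 0.5, 0.9], y=1.0))
```

By construction the three returned values add up to one for any finite y, which is a convenient sanity check when implementing the update.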

Analogous equations are valid for the iteration of the cavity quantities, where again the influence of the cavity site 
has to be taken out: 

$$ \begin{aligned}
\pi_{i|l}^{(0)} &= C_{i|l}^{-1} \prod_{j\in N(i)\setminus l} \left(1 - \pi_{j|i}^{(0)}\right) , \\
\pi_{i|l}^{(*)} &= C_{i|l}^{-1}\, e^{-y} \sum_{j\in N(i)\setminus l} \pi_{j|i}^{(0)} \prod_{j'\in N(i)\setminus \{j,l\}} \left(1 - \pi_{j'|i}^{(0)}\right) , \\
\pi_{i|l}^{(1)} &= C_{i|l}^{-1}\, e^{-y} \left[ 1 - \prod_{j\in N(i)\setminus l} \left(1 - \pi_{j|i}^{(0)}\right) - \sum_{j\in N(i)\setminus l} \pi_{j|i}^{(0)} \prod_{j'\in N(i)\setminus \{j,l\}} \left(1 - \pi_{j'|i}^{(0)}\right) \right] ,
\end{aligned} \tag{31} $$

where $C_{i|l}$ is fixed by the normalization $\pi_{i|l}^{(0)} + \pi_{i|l}^{(*)} + \pi_{i|l}^{(1)} = 1$. 



These are the equations for survey propagation. To solve them, one first has to initialize the cavity quantities arbitrarily 
and update them iteratively according to the second set of equations. Once convergence is reached, the $\pi_i^{(\cdot)}$ can 
simply be evaluated from the first set of equations. Note also that the SP equations for the cavity quantities close in the 
$\pi_{i|l}^{(0)}$ alone: 

$$ \pi_{i|l}^{(0)} = \frac{\prod_{j\in N(i)\setminus l} \left(1 - \pi_{j|i}^{(0)}\right)}{e^{-y} + \left(1 - e^{-y}\right) \prod_{j\in N(i)\setminus l} \left(1 - \pi_{j|i}^{(0)}\right)} . \tag{32} $$
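
A sequential iteration of the closed cavity equation can be sketched as below. The dictionary layout (`msg[(i, l)]` standing for the cavity quantity of vertex i in the absence of neighbor l), the function name `sp_iterate`, the convergence tolerance and the sweep limit are all assumptions of this sketch rather than details given in the text.

```python
import math
import random


def sp_iterate(adj, y, sweeps=200, tol=1e-10, seed=0):
    """Iterate the closed SP equation for pi^(0)_{i|l} on a graph.

    adj : dict vertex -> list of neighbors
    Returns (msg, converged) with msg[(i, l)] = pi^(0)_{i|l}.
    """
    rng = random.Random(seed)
    w = math.exp(-y)
    # Arbitrary (random) initialization of all 2M cavity quantities.
    msg = {(i, l): rng.random() for i in adj for l in adj[i]}
    for _ in range(sweeps):
        largest_change = 0.0
        for (i, l) in msg:  # sequential update: every message once per sweep
            prod = math.prod(1.0 - msg[(j, i)] for j in adj[i] if j != l)
            new = prod / (w + (1.0 - w) * prod)
            largest_change = max(largest_change, abs(new - msg[(i, l)]))
            msg[(i, l)] = new
        if largest_change < tol:
            return msg, True
    return msg, False


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1]}
    messages, ok = sp_iterate(path, y=1.0)
    print(ok, messages)
```

On a tree such as the three-vertex path in the usage example, the iteration converges after a few sweeps to messages that are all zero or one; on graphs with many loops and large y, the `converged` flag is what signals the non-convergence discussed later.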



A note on the selection of the re-weighting parameter y is necessary: Finite values of y focus on local minima of the 
complex landscape $X(U) = |U|$ defined over all VCs, i.e. on VCs whose cardinality cannot be decreased by changing 
only a finite part of U. One would thus naively expect that minimal VCs are obtained in the limit $y \to \infty$. As we 
will see in the next section, the SP solution carries, however, sensible physical information only in a limited interval 
$y \in [0, y^*]$. It is therefore necessary to work directly with finite y-values. 

The knowledge of all $\pi_i^{(\cdot)}$ does not allow us to directly create a (locally) minimal vertex cover. It is impossible to 
deduce a joint probability distribution of all vertices from the knowledge of the marginal single-vertex probabilities 
only. Nevertheless, some useful knowledge can be drawn directly from these quantities. In particular we may estimate 
the VC size by 

$$ X(y) = \sum_{i} \left[ \pi_i^{(1)} + \frac{1}{2}\, \pi_i^{(*)} \right] . \tag{33} $$

As in the replica-symmetric case of WP, we have assumed that vertices carrying value * are, on average, half covered 
and half uncovered. At this point, this is a pure conjecture which, however, will be justified in the next section [see 
Eq. (38)]. 

To actually construct a minimum vertex cover (or an approximation, due to the non-exactness of SP because of, 
e.g., a finite value of y, cycles in the graph or more levels of cluster organization), we have to resort again to an 
iterative decimation scheme. In every step, the $\pi_i^{(\cdot)}$ are calculated, and one vertex of large $\pi_i^{(1)}$ is selected and covered. It 
is removed from the graph together with all incident edges, and the $\pi_i^{(\cdot)}$ are reiterated on the decimated graph. This 
procedure is iterated until no uncovered edges are left, i.e., until a proper VC is constructed. Slightly different 
selection heuristics can be applied (select a vertex of high $\pi_i^{(0)}$, uncover it, cover all neighbors, and decimate the 
graph, or take into account also the value of $\pi_i^{(*)}$). All these heuristic rules are equally valid in the sense that, if SP 
is exact on a graph, they all produce one minimum VC of the same size. For real graphs, however, where the results 
of the SP equations are to be considered as approximations of the actual quantities defined over the set of solutions, 
different heuristic choices may result in VCs of different sizes. Within the numerical experiments described below, we 
have, however, found no preferable selection heuristic, and the fluctuations from one heuristic to another were small 
compared with the VC size. 
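
The control flow of the decimation scheme can be sketched independently of the message-passing step. In the sketch below, `cover_score` is a placeholder for a full SP run returning an estimate of $\pi_i^{(1)}$ for every vertex of the current graph; the toy scorer `degree_score` is NOT taken from the paper and is used only to keep the example self-contained and runnable.

```python
def decimate(adj, cover_score):
    """Greedy decimation: cover the vertex with the largest score, remove it
    with all incident edges, recompute the scores, and repeat until no
    uncovered edge is left. Returns the list of covered vertices."""
    graph = {v: set(nbs) for v, nbs in adj.items()}
    cover = []
    while any(graph.values()):
        scores = cover_score(graph)           # stand-in for one SP run
        candidates = [v for v in graph if graph[v]]
        v = max(candidates, key=lambda u: scores[u])
        cover.append(v)
        for nb in graph[v]:
            graph[nb].discard(v)              # delete incident edges
        graph[v] = set()
    return cover


def degree_score(graph):
    """Toy stand-in for SP marginals (an assumption of this sketch):
    the degree of each vertex, normalized by the largest degree."""
    top = max(len(nbs) for nbs in graph.values()) or 1
    return {v: len(nbs) / top for v, nbs in graph.items()}


if __name__ == "__main__":
    star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
    print(decimate(star, degree_score))
```

Replacing `degree_score` by a routine that iterates the SP equations on the decimated graph and returns the resulting $\pi_i^{(1)}$ gives the scheme described in the text; the alternative heuristics mentioned there only change the selection rule inside the loop.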



B. The complexity of clusters 



Different values of the re-weighting parameter y lead to a concentration of the partition sum (or, equivalently, the 
solution of the SP equations) to clusters of vertex covers of different (locally minimal) size. The complexity $\Sigma(X)$, or 
configurational entropy, measures the logarithm of the number $\mathcal{N}_{cl}(X)$ of clusters of given VC size X. We introduce 
the generalized thermodynamic potential $\Phi(y)$ as the Legendre transform of the complexity, 

$$ e^{-y\,\Phi(y)} = \sum_{X=0}^{N} \exp\left\{ -yX + \Sigma(X) \right\} . \tag{34} $$



According to the general procedure of the cavity method in diluted systems ^3 > this potential can be decomposed 
into site and link contributions, 

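$$ \Phi(y) = \sum_{i\in V} \Delta\Phi_i - \sum_{\{i,j\}\in E} \Delta\Phi_{i,j} . \tag{35} $$

These contributions can be determined by adding a vertex / an edge to the graph: 

$$ \begin{aligned}
e^{-y\,\Delta\Phi_i} &= \prod_{j\in N(i)} \left(1 - \pi_{j|i}^{(0)}\right) + e^{-y} \left( 1 - \prod_{j\in N(i)} \left(1 - \pi_{j|i}^{(0)}\right) \right) \\
&= e^{-y} + \left(1 - e^{-y}\right) \prod_{j\in N(i)} \left(1 - \pi_{j|i}^{(0)}\right) , \\
e^{-y\,\Delta\Phi_{i,j}} &= 1 - \left(1 - e^{-y}\right) \pi_{i|j}^{(0)}\, \pi_{j|i}^{(0)} ,
\end{aligned} \tag{36} $$

where, as in the derivation of the SP equations, one has to take care separately of the cases where the VC size 
remains unchanged under vertex or link addition, or increases by one. Having solved the SP equations for the $\pi_{i|j}^{(0)}$, 
the potential becomes easy to calculate, 

$$ -y\,\Phi(y) = \sum_{i\in V} \ln\left[ e^{-y} + \left(1 - e^{-y}\right) \prod_{j\in N(i)} \left(1 - \pi_{j|i}^{(0)}\right) \right] - \sum_{\{i,j\}\in E} \ln\left( 1 - \left(1 - e^{-y}\right) \pi_{i|j}^{(0)}\, \pi_{j|i}^{(0)} \right) . \tag{37} $$

Approximating the sum in Eq. (34) by the saddle point method (valid for $N \gg 1$), we see that the complexity can be 
calculated via 

$$ \Sigma(y) = \Sigma(X(y)) = y \left( X(y) - \Phi(y) \right) , \tag{38} $$

where X(y) is given in Eq. (33) in dependence of the SP solution. The function X(y) can also be determined directly 
from the potential $\Phi(y)$ via $X(y) = \Phi(y) + y\,\Phi'(y)$. The numerical observation that both expressions for X(y) coincide 
is a strong justification for the ratio 2 used in Eq. (33) between the number of all unfrozen vertices and the number 
of simultaneously covered unfrozen vertices. 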
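
Given converged cavity quantities, the VC size estimate, the potential and the complexity follow from simple sums over vertices and edges. The function below is a sketch under the assumption that the messages are stored as `msg[(j, i)]` for the cavity quantity of vertex j in the absence of i; the name `sp_potential` is hypothetical, and y must be strictly positive.

```python
import math


def sp_potential(adj, msg, y):
    """Return (X, Phi, Sigma): the VC size estimate of Eq. (33), the
    potential of Eq. (37) and the complexity of Eq. (38)."""
    w = math.exp(-y)
    log_sites = 0.0
    x_est = 0.0
    for i, nbs in adj.items():
        p = [msg[(j, i)] for j in nbs]
        none_zero = math.prod(1.0 - q for q in p)
        one_zero = sum(
            q * math.prod(1.0 - r for m, r in enumerate(p) if m != k)
            for k, q in enumerate(p)
        )
        norm = w + (1.0 - w) * none_zero          # site term of the potential
        log_sites += math.log(norm)
        pi_star = w * one_zero / norm
        pi_one = w * (1.0 - none_zero - one_zero) / norm
        x_est += pi_one + 0.5 * pi_star
    # Each undirected edge {i, j} is counted once (i < j).
    log_links = sum(
        math.log(1.0 - (1.0 - w) * msg[(i, j)] * msg[(j, i)])
        for i in adj for j in adj[i] if i < j
    )
    phi = -(log_sites - log_links) / y
    sigma = y * (x_est - phi)
    return x_est, phi, sigma


if __name__ == "__main__":
    edge = {0: [1], 1: [0]}
    print(sp_potential(edge, {(0, 1): 1.0, (1, 0): 1.0}, y=1.5))
```

For a single isolated edge both cavity quantities equal one, and the sketch returns a VC size estimate and a potential of one with vanishing complexity, as expected for a graph whose minimal covers contain exactly one vertex.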

The complexity $\Sigma$ is defined as the logarithm of the cluster number, i.e. in the presence of at least one cluster it 
necessarily takes a non-negative value. This defines a range $y \in (0, y^*)$ where the SP equations provide a potentially sensible 
solution, with $y^*$ given by the marginality condition $\Sigma(y^*) = 0$. For higher y, the predicted complexities become 
negative - corresponding thus to unphysical solutions of the SP equations. We see that the naive expectation that 
$y \to \infty$ leads to minimal VCs is thus inconsistent; the best possible estimate for the minimal VC size we can obtain 
at the level of SP (one-step replica symmetry breaking) is thus given by X(y*) [n|. Note that this observation, in 
replica theory corresponds to the usual optimization of the replicated free energy over the replica symmetry breaking 
parameter [34, 35]. Note also that the existence of a finite $y^*$ is a clear signal for the existence of more than one step 
of replica symmetry breaking, and the SP results can only be expected to be approximations to true minimal VCs. 



C. Stability of the fixed point under SP iteration 



It is, however, not clear if SP converges at all in the replica-symmetry broken phase. To investigate this question, 
we consider the behavior of the solution of Eq. (32) under small perturbations. Note that the situation here is different 
from the bug proliferation picture used in order to analyze the stability of WP fixed points: The messages now are 
real numbers and thus small perturbations are possible even on the level of a single message. 

Let us therefore imagine that we start a set of experiments, with initial conditions $\pi_{i|l}$ distributed around the SP 
solution $\pi_{i|l}^{(0)}$ according to some narrow distribution 

$$ f_{i|l}(\pi_{i|l}) = \frac{1}{\sqrt{2\pi}\, \epsilon_{i|l}} \exp\left\{ - \frac{\left( \pi_{i|l} - \pi_{i|l}^{(0)} \right)^2}{2\, \epsilon_{i|l}^2} \right\} \tag{39} $$

of link-dependent widths $\epsilon_{i|l} \ll 1$. After one iteration of SP, the messages are distributed according to (for simplicity 
we have skipped the superscript (0)) 



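$$ f'_{i|l}(\pi_{i|l}) = \int \prod_{j\in N(i)\setminus l} \left[ d\pi_{j|i}\, f_{j|i}(\pi_{j|i}) \right] \delta\left( \pi_{i|l} - \hat\pi(\{\pi_{j|i}\}) \right) \tag{40} $$

with the update rule $\hat\pi$ given by Eq. (32). We expand this update function around the SP fixed point, 

$$ \hat\pi(\{\pi_{j|i}\}) = \pi_{i|l}^{(0)} + \sum_{j\in N(i)\setminus l} \frac{\partial \pi_{i|l}^{(0)}}{\partial \pi_{j|i}^{(0)}} \left( \pi_{j|i} - \pi_{j|i}^{(0)} \right) + \ldots \tag{41} $$

The mean of the updated message is given by 

$$ \int d\pi_{i|l}\, f'_{i|l}(\pi_{i|l})\, \pi_{i|l} = \pi_{i|l}^{(0)} + \mathcal{O}(\epsilon^2) , \tag{42} $$

and its change is negligible with respect to the width of the distribution. The second moment, on the other hand, 
behaves as 

$$ \int d\pi_{i|l}\, f'_{i|l}(\pi_{i|l}) \left( \pi_{i|l} - \pi_{i|l}^{(0)} \right)^2 = \sum_{j\in N(i)\setminus l} \left( \frac{\partial \pi_{i|l}^{(0)}}{\partial \pi_{j|i}^{(0)}} \right)^2 \epsilon_{j|i}^2 + \ldots \tag{43} $$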



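We find thus that the variance of the updated distribution behaves as 

$$ \epsilon'^{\,2}_{i|l} = \sum_{j\in N(i)\setminus l} \left( T_{i|l,j|i} \right)^2 \epsilon_{j|i}^2 \tag{44} $$

with 

$$ T_{i|l,j|i} = \frac{\partial \pi_{i|l}^{(0)}}{\partial \pi_{j|i}^{(0)}} = - \frac{e^{-y} \prod_{k\in N(i)\setminus \{j,l\}} \left(1 - \pi_{k|i}^{(0)}\right)}{\left[ e^{-y} + \left(1 - e^{-y}\right) \prod_{k\in N(i)\setminus l} \left(1 - \pi_{k|i}^{(0)}\right) \right]^2} . \tag{45} $$

The (in)stability of this equation is related to the largest eigenvalue $\lambda_{\max}$ of the matrix with entries $T_{i|l,j|i}^2$: only if $\lambda_{\max}$ is 
smaller than one do the perturbations $f_{i|l}(\pi_{i|l})$ of the SP solution contract exponentially. 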

Note that this type of stability of the SP fixed point is known in the literature under the name "type-one instability" 
[32l l3fil | and can be related to the appearance of more than one step of replica symmetry breaking, more precisely to 
the fragmentation of the solution clusters in sub-clusters. It is not the only type of instability of the one-step replica- 
symmetry-broken solution with respect to more steps, an alternative scheme would be the accumulation of clusters 
into clusters of clusters ("type-two instability"). The latter instability, however, does not lead to an iterative instability 
of the SP equations themselves, i.e. the latter can be used even if they are not physically exact. This is what happens in the 
case of VC [19]. 



D. Numerical tests 



1. The size of the constructed VC 



In order to check the performance of SP, we have tested it on single samples of random graphs of medium to large 
size. In Fig. 4 we concentrate on a single graph of N = 50 000 vertices and average degree c = 10. The data reported 
in Fig. 4 are quantitatively comparable to other graphs with the same parameters, and qualitatively to graphs of 
other sizes and connectivities. We show trajectories of the estimated VC size during graph reduction, as a function of 
the number of vertices which are still in the reduced graph, for various values of the re-weighting parameter y. We see 
that the initial variability of the estimates is much larger than the difference in output. Even the worst performing 
case, y = 0, outputs a VC of 34 171 vertices compared to the smallest one found, with X = 34 104. This similarity 
is due to the fact that the ranking of the vertices with respect to the SP results depends only weakly on y, whereas 
the messages themselves change considerably - and thus the corresponding predictions of the VC size. Close to the end 
of the curves, there are some striking fluctuations in the VC size. In this region, SP was not able to converge to 
a fixed point, and the non-converged solution was used. This non-convergence of SP may be related to the critical 
slowing down of SP at the phase boundary when the solution space of the minimal VC problem transits between the 
two schemes of Fig. 3 (see Sec. VI D). After an interval of these fluctuations, the SP solution collapses to the WP 
solution, i.e. we go from the replica-symmetry broken, clustered phase to the replica-symmetric, unclustered phase. 
Note that the performance of SP improves with increasing values of y, as long as it converges in the vast majority of 
the decimation steps. The region of non-convergence, however, grows with y. 




FIG. 4: Trajectories of graph decimation for a single graph with N = 50 000 and c = 10. We plot the VC size as estimated by 
SP, as a function of the vertex number in the remaining graph. The decimation proceeds from the right to the left, i.e. from 
the initial N = 50 000 toward zero. The fact that the estimate changes under application of the reduction process results from 
the approximate nature of the SP messages. 

To circumvent this problem, we have introduced a version of SP with adaptive y-values. We start SP with a 
relatively large y, and whenever the convergence time exceeds a certain threshold (we have used, e.g., 100 sequential 
updates of all messages), the value of y is decreased (we have, e.g., multiplied it by 0.9). As a result, the trajectory 
of predicted VC sizes is smoothened, and the algorithm automatically tends toward the lowest found VC sizes. 
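
The adaptive choice of y can be sketched as a small wrapper around an SP routine. Here `run_sp(y, max_sweeps)` is a hypothetical callable that returns True if SP converged within the given number of sequential sweeps; the default threshold of 100 sweeps and the factor 0.9 are the example values quoted above, while the lower bound `y_min` is an assumption added to guarantee termination.

```python
def run_with_adaptive_y(run_sp, y_start, max_sweeps=100, factor=0.9, y_min=1e-3):
    """Lower y geometrically until SP converges within max_sweeps sweeps.

    run_sp : callable (y, max_sweeps) -> bool, True on convergence
    Returns the first y at which SP converged, or None if y fell below y_min.
    """
    y = y_start
    while y >= y_min:
        if run_sp(y, max_sweeps):
            return y
        y *= factor  # convergence too slow: reduce the re-weighting
    return None


if __name__ == "__main__":
    # Toy convergence model (not from the paper): SP "converges" for y < 2.
    print(run_with_adaptive_y(lambda y, sweeps: y < 2.0, y_start=3.0))
```

In a full decimation run the returned value would be carried over as the starting y of the next decimation step, so that y only decreases along the trajectory.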

As already mentioned, the original estimate for X varies a lot with y; it is even non-monotonic. Whereas the value 
for y = 0 is substantially smaller than the smallest constructed VCs, there is a local maximum which is larger than 
the constructed VCs. It is, however, astonishing that the extrapolated value at $y^*$, where the complexity $\Sigma$ vanishes, 
is extremely close to the finally constructed value: 34 090 ± 10 compared to 34 104. This is even more astonishing 
since we do not reach convergence of SP at $y^*$ for c = 10, see the discussion in the next sub-section. 

FIG. 5: Numerical results of SP run on graphs of high, but finite average degree $c \in [10, 400]$. As a comparison, we have 
added the results of two local algorithms (Gazmuri's heuristic and GLR), and rigorous bounds on the asymptotic average 
behavior for random graphs as well as the exact large-c asymptotics. SP clearly outperforms local algorithms, and is close to 
the asymptotically exact value. 

To see the behavior of SP in the full range of average degrees, we have systematically scanned the c-interval [10, 400], 
as can be seen in Fig. 5. The graph sizes for these high connectivities range up to N = 6400 (note that in this case 
up to $cN \simeq 2\,560\,000$ messages have to be handled). The results for fixed c and various N were extrapolated to 
their asymptotic value at $N \to \infty$, in order to be comparable to analytical results and to the performance of local 
algorithms. We see that SP performs much better than the local heuristics, and its behavior is consistent with the exact 
large-c behavior found by Frieze [14]. For the comparison we have used two local algorithms: The first one is a simple 
heuristic by Gazmuri [15], where in every step a vertex is selected randomly, all its neighbors are covered, and the 
covered edges are removed from the graph. The second heuristic is a generalization of leaf removal [3]| working also 
beyond average degree c = e, but not guaranteeing minimality of the constructed VC any more. The algorithm selects 
in every step a vertex of minimal degree, covers its neighbors and removes all considered vertices and covered edges. 
If the algorithm never needs to select vertices of degree exceeding one, it reduces to leaf removal. As already said, SP 
outperforms the local algorithms. 
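
The generalized leaf-removal heuristic described above can be sketched as follows. The function names, the edge-list input and the tie-breaking by vertex label are assumptions of this sketch; the text only specifies that a vertex of minimal degree is selected in every step.

```python
def generalized_leaf_removal(n, edges):
    """Repeatedly pick a non-isolated vertex of minimal degree, cover all its
    neighbors, and remove the covered vertices with their edges."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    cover = set()
    active = {v for v in adj if adj[v]}
    while active:
        # Minimal degree first; ties broken by label (assumption).
        v = min(active, key=lambda u: (len(adj[u]), u))
        for nb in list(adj[v]):
            cover.add(nb)
            for w in list(adj[nb]):
                adj[w].discard(nb)
            adj[nb].clear()
        active = {u for u in active if adj[u]}
    return cover


def is_cover(cover, edges):
    """Check that every edge has at least one covered end vertex."""
    return all(u in cover or v in cover for u, v in edges)


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    c = generalized_leaf_removal(4, edges)
    print(sorted(c), is_cover(c, edges))
```

As long as the selected vertex always has degree one, the routine performs exactly the leaf-removal steps; once only vertices of higher degree remain, it still returns a valid cover but minimality is no longer guaranteed.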

A drawback for all c-values is, however, that the algorithm does not work at high values of y, which, given the 
derivation of the SP equations, should bring us closest to a minimal VC. 

2. On the iterative stability of SP 

Running SP for different values of the re-weighting parameter y, we observe that it converges very fast for small y 
(and $c > e$), but it does not converge at all for large y. At first sight, it therefore seems useless to check the 
stability of the SP solution via the eigenvalues of the stability matrix defined via Eqs. (44) and (45): The solution itself is found via iteration 
of Eq. (32) starting from a random initial condition, i.e. if this iteration converges, the solution is automatically stable. 
On the other hand, it is much harder to extrapolate precisely the point where the convergence time diverges than 
to identify this point by $\lambda_{\max} \to 1$. 

Technically the eigenvalue can be determined in a way inspired by the message-passing procedure itself: We randomly 
initialize all $\epsilon_{i|j}$ to non-negative values, and update them according to Eq. (44). Then, we renormalize the vector, 
dividing it by $\sum_{(i,j)\in E} \left( \epsilon_{i|j}^2 + \epsilon_{j|i}^2 \right)$. This is repeated until convergence of the procedure is reached, and $\lambda_{\max}$ 
equals the asymptotic renormalization factor, see [38] for an analogous approach to testing the stability of a replica 
symmetric solution in the problem of counting graph loops. 
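
The procedure just described is a power iteration, and can be sketched as below. The sketch assumes that the squared widths are propagated with the squared derivative of the closed cavity equation, that messages are stored as `msg[(i, l)]`, and that the normalization is the plain sum of all squared widths; the function name and the fixed iteration count are illustrative choices, not details from the original work.

```python
import math
import random


def largest_stability_eigenvalue(adj, msg, y, iterations=500, seed=0):
    """Power iteration for the growth rate of squared perturbation widths.

    adj : dict vertex -> list of neighbors
    msg : dict (i, l) -> fixed-point value of pi^(0)_{i|l}
    """
    w = math.exp(-y)
    rng = random.Random(seed)
    eps2 = {e: rng.random() for e in msg}
    total = sum(eps2.values())
    eps2 = {e: v / total for e, v in eps2.items()}
    lam = 0.0
    for _ in range(iterations):
        new = {}
        for (i, l) in msg:
            others = [j for j in adj[i] if j != l]
            denom = (w + (1.0 - w)
                     * math.prod(1.0 - msg[(k, i)] for k in others)) ** 2
            acc = 0.0
            for j in others:
                numer = w * math.prod(1.0 - msg[(k, i)] for k in others if k != j)
                acc += (numer / denom) ** 2 * eps2[(j, i)]
            new[(i, l)] = acc
        lam = sum(new.values())      # renormalization factor of this step
        if lam == 0.0:
            return 0.0               # perturbations die out exactly
        eps2 = {e: v / lam for e, v in new.items()}
    return lam


if __name__ == "__main__":
    ring = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    half = {(i, l): 0.5 for i in ring for l in ring[i]}
    print(largest_stability_eigenvalue(ring, half, y=0.0))
```

On a tree the perturbations vanish after finitely many steps and the routine returns zero, whereas on the three-vertex ring at y = 0 (where all messages equal 1/2 at the fixed point) the renormalization factor is one, i.e. the marginal case.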

The results of the numerical tests are shown in Fig. 6. For the c-values displayed there, we find that the SP solution 
is stable against small perturbations in the vicinity of y = 0, but $\lambda_{\max}$ starts to grow right away. For all the displayed 
values, we also find that $\lambda_{\max}$ approaches one at positive complexity, i.e. at values $y < y^*$. At the y-value at which, in 
the ensemble average, the 1RSB result is expected to be most precise compared to the exact value, SP does not even 
converge on the single sample. 

FIG. 6: Stability of the SP solution: The complexity $\Sigma$ is plotted versus the largest eigenvalue $\lambda_{\max}$ of the stability matrix 
defined in Eq. (45), for various values of the average graph degree c. All data are produced from graphs with N = 10 000 
vertices, averaged over 10 samples. Error bars both in $\lambda$ and $\Sigma$ are smaller than the symbol size. 

This changes for c > 20.4. At this point, the instability threshold coincides precisely with the zero-complexity point 
corresponding to minimal VCs. At even higher values, SP thus converges at $y = y^*$. However, this does not necessarily 
mean that we can carry out the whole decimation process efficiently at the initial $y^*$: after decimation of a c-dependent fraction 
of the graph, SP starts to diverge even for large c. 



VI. FROM SURVEY PROPAGATION TO TYPICAL PROPERTIES ON RANDOM GRAPHS 

In Sec. IV B we have seen that it is possible to average the solution of warning propagation over random graphs 
of average vertex degree c, and to recover the replica-symmetric results of 01 i n the thermodynamic limit. In 
analogy, the equations of survey propagation can be used to reproduce and extend the results of [19] on the typical 
properties of minimal VCs under the assumption of one step of replica-symmetry breaking (1RSB), i.e., to translate 
the probabilistic-algorithmic approach on single graph instances to a statistical-physics approach with the graph 
randomness playing the role of the quenched disorder. 

As already explained above, the computational hardness of the VC problem results from the fact that the landscape 
of VC sizes over the space of all vertex covers may become rough. It may contain, in particular, many local minima: 
A VC U is considered to be locally optimal if all VCs differing only in a finite number of vertices are at least as large 
as U. Such local minima are expected to act as traps for many local search algorithms, which therefore are unable to 
find globally optimal VCs. If 1RSB is considered, not only the minimal VCs are assumed to be clustered (cf. Fig. 3), 
but also an exponential number of clusters of locally optimal VCs are expected to appear. The total number of such 
VC clusters of cardinality X = xN in a given graph G is denoted as $\Omega_G(X)$. The complexity of graph G at VC 
density (or 'energy density') x is defined as 

$$ \Sigma_G(X) = \frac{1}{N} \ln \Omega_G(X) ; \tag{46} $$

with respect to Sec. V B we have renormalized the complexity by a factor 1/N to assure a sensible thermodynamic 
limit. The complexity $\Sigma_G(X)$ is expected to be self-averaging [34]: When N is increased, the complexity of randomly 
drawn graphs G approaches asymptotically the mean value averaged over the whole graph ensemble. Technically, the 
interesting quantity is thus 



$$ \Sigma(c, x) = \lim_{N\to\infty} \frac{1}{N} \left\langle \ln \Omega_G(xN) \right\rangle_G , \tag{47} $$

where $\langle \cdots \rangle_G$ denotes the average over all random graphs G of fixed parameters N and c, since this value is found almost 
surely also in very large random graphs. 

The partial derivative of $\Sigma(c, x)$ with respect to x is denoted as 

$$ y = \frac{\partial \Sigma(c, x)}{\partial x} . \tag{48} $$

The above equation gives an implicit relationship between y and the VC density x. We can define a generalized free 
energy density at given value of y and c via the Legendre transform 

$$ \phi(c, y) = x(c, y) - \frac{\Sigma(c, x(c, y))}{y} , \tag{49} $$

in complete analogy to the single-graph quantity $\Phi$ in Eq. (34). 
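
As a short consistency check (a worked step added here, using only Eqs. (48) and (49)), one can differentiate $y\,\phi(c,y)$ with respect to y at fixed c:

$$ \frac{\partial \left[ y\, \phi(c,y) \right]}{\partial y} = x(c,y) + y\, \frac{\partial x(c,y)}{\partial y} - \frac{\partial \Sigma(c,x)}{\partial x}\, \frac{\partial x(c,y)}{\partial y} = x(c,y) , $$

where the last two terms cancel because of Eq. (48). Hence $x(c,y) = \phi(c,y) + y\, \partial\phi(c,y)/\partial y$, which is the ensemble counterpart of the single-graph relation between X(y) and the potential quoted in Sec. V B. 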

The parameter y formally corresponds to an inverse temperature $\beta$ in an ordinary statistical physics system, with 
the difference that microscopic configurations are replaced by clusters of locally optimal VCs. It can be used to control 
the mean VC density x of our artificial statistical-physics system. This is in fact done in the SP equations: as we will 
see below, y is exactly the re-weighting parameter introduced before. 

All this holds true as long as the relative VC size x = X/N is such that $\Omega_G(xN) \gg 1$ for a typical random graph G, 
i.e., $\Sigma(c, x) > 0$. When $\Sigma(c, x) < 0$, a typical random graph G has no optimal VC clusters of density x. The largest 
allowable value $y = y^*$ is thus located at the point where $\Sigma(c, x(c, y^*)) = 0$. This point also corresponds to the best 
1RSB estimate of the globally minimal VC size $x(c, y^*)$ of a typical random graph, cf. the discussion at the end of 
Sec. V B. 



A. The cavity equation 

As already discussed, for each cluster of VCs, vertices can be decorated by a three-state variable: It assumes the 
value 1 if the vertex belongs to all VCs in the cluster, it takes the value * if it belongs to some but not all VCs, and 
the value 0 if it belongs to no VCs of the cluster. Let us also recall the notation $\pi_i^{(0)}$, $\pi_i^{(*)}$, and $\pi_i^{(1)}$ ($= 1 - \pi_i^{(0)} - \pi_i^{(*)}$) 
for the probability that vertex i takes the corresponding value in a randomly selected locally optimal VC at given y. 
The values $\vec\pi_i = (\pi_i^{(0)}, \pi_i^{(*)}, \pi_i^{(1)})$ fluctuate from vertex to vertex, and the main aim of the cavity method is to describe 
their distribution in a self-consistent way. 

Suppose one already knows $\vec\pi_i$ for each vertex i of a random graph G with N vertices. Now add a new vertex (say 
vertex 0) and connect it to k randomly chosen vertices (say j = 1, 2, ..., k) of graph G. The integer k is determined 
according to the Poisson distribution $f_c(k) = e^{-c} c^k / k!$. After vertex 0 and the k edges are added, a new graph G' 
of N + 1 vertices is constructed. Under the assumption of statistical independence of the vertices j = 1, 2, ..., k in 
graph G [cf. the comment below Eq. (30)], one can write down the following equations for $\vec\pi_0$: 

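$$ \pi_0^{(0)} = \frac{\prod_{j=1}^{k} \left(1 - \pi_j^{(0)}\right)}{e^{-y} + \left(1 - e^{-y}\right) \prod_{j=1}^{k} \left(1 - \pi_j^{(0)}\right)} , \tag{50} $$

$$ \pi_0^{(*)} = \frac{e^{-y} \sum_{j=1}^{k} \pi_j^{(0)} \prod_{j'\neq j} \left(1 - \pi_{j'}^{(0)}\right)}{e^{-y} + \left(1 - e^{-y}\right) \prod_{j=1}^{k} \left(1 - \pi_j^{(0)}\right)} , \tag{51} $$

$$ \pi_0^{(1)} = 1 - \pi_0^{(0)} - \pi_0^{(*)} . \tag{52} $$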



Assuming furthermore that the statistical properties of the $\vec\pi_i$ do not change drastically by adding the new vertex, 
Eq. (50) allows us to write down the following self-consistent equation governing the probability distribution of $\pi^{(0)}$: 

$$ P(\pi^{(0)}) = f_c(0)\, \delta\left( \pi^{(0)} - 1 \right) + \sum_{k=1}^{\infty} f_c(k) \int \prod_{j=1}^{k} \left[ d\pi_j^{(0)}\, P(\pi_j^{(0)}) \right] \delta\left( \pi^{(0)} - \frac{\prod_{j=1}^{k} \left(1 - \pi_j^{(0)}\right)}{e^{-y} + \left(1 - e^{-y}\right) \prod_{j=1}^{k} \left(1 - \pi_j^{(0)}\right)} \right) . \tag{53} $$

This equation can be numerically solved with very high precision using a standard algorithm of population dynamics. 
Note also the equivalence of the update rule in the delta function to Eq. (32). One can in fact estimate $P(\pi^{(0)})$ also 
by first generating a huge random graph, iterating SP on it and then calculating the histogram of all messages. 
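
A population-dynamics solution of the self-consistent equation can be sketched as follows. The population size, the number of sweeps, the sweep-wise replacement of every member and the simple Poisson sampler are choices made for this illustration only; the text does not specify these details.

```python
import math
import random


def poisson(c, rng):
    """Knuth's multiplication sampler; adequate for moderate c."""
    limit, k, p = math.exp(-c), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def population_dynamics(c, y, size=1000, sweeps=50, seed=0):
    """Represent P(pi^(0)) by a population of `size` values and iterate the
    update rule inside the delta function of the self-consistent equation."""
    rng = random.Random(seed)
    w = math.exp(-y)
    pop = [rng.random() for _ in range(size)]
    for _ in range(sweeps):
        for idx in range(size):
            k = poisson(c, rng)                       # degree of new vertex
            prod = math.prod(1.0 - rng.choice(pop) for _ in range(k))
            pop[idx] = prod / (w + (1.0 - w) * prod)  # new value of pi^(0)
    return pop


if __name__ == "__main__":
    population = population_dynamics(c=3.0, y=1.0, size=500, sweeps=20)
    print(sum(population) / len(population))
```

A histogram of the returned population approximates the distribution, and averages over it can be used wherever integrals over this distribution appear.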

B. VC density and complexity 

Also the VC density is self-averaging. When the graph size N is sufficiently large, the VC density of a typical graph 
G is almost independent of the microscopic details of G; it only depends on the statistical properties of the graph 
ensemble represented by the mean vertex degree c. At fixed value of the re-weighting parameter y, also this mean VC 
density x(c, y) can be calculated using the cavity method. The graph G' as generated in the preceding subsection has 
N + 1 vertices and mean vertex degree c' = 2 (M + k)/(N + 1) = c + (2k — c)/(N + 1). The expectation of the VC 
density of G' and that of the graph G are related by 

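$$ (N + 1)\, x(c', y) = N\, x(c, y) + 1 - \pi_0^{(0)} . \tag{54} $$

Expanding $x(c', y)$ around c, and keeping only the non-vanishing terms in the thermodynamic limit, we obtain 

$$ x(c, y) + c\, \frac{\partial x(c, y)}{\partial c} = 1 - \left\langle \pi_0^{(0)} \right\rangle_{G'} . \tag{55} $$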



To obtain an expression for ∂x(c,y)/∂c, we add a new edge between two randomly chosen vertices (say vertices j 
and l) of the old graph G and thus construct a new graph G''. This new graph has mean vertex degree c'' = c + 2/N. 
Averaged over all the locally optimal VC clusters at fixed re-weighting parameter y, the mean increase in VC size 
due to the addition of edge (j, l) is

$$ \frac{e^{-y}\,\pi_j^{(0)}\pi_l^{(0)}}{1 - (1-e^{-y})\,\pi_j^{(0)}\pi_l^{(0)}} , \tag{56} $$



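since it results from the case that both end vertices j and l are uncovered in the corresponding cluster. In other 
words, we have

$$ N\,x(c'',y) = N\,x(c,y) + \left\langle \frac{e^{-y}\,\pi_j^{(0)}\pi_l^{(0)}}{1 - (1-e^{-y})\,\pi_j^{(0)}\pi_l^{(0)}} \right\rangle_{G''} , \tag{57} $$

which leads to the expression

$$ \frac{\partial x(c,y)}{\partial c} = \frac{1}{2} \left\langle \frac{e^{-y}\,\pi_j^{(0)}\pi_l^{(0)}}{1 - (1-e^{-y})\,\pi_j^{(0)}\pi_l^{(0)}} \right\rangle_{G''} . \tag{58} $$

Combining Eq. (55) and Eq. (58) we finally obtain that

$$ \begin{aligned}
x(c,y) &= 1 - \bigl\langle \pi_0^{(0)} \bigr\rangle_{G'} - \frac{c}{2} \left\langle \frac{e^{-y}\,\pi_j^{(0)}\pi_l^{(0)}}{1 - (1-e^{-y})\,\pi_j^{(0)}\pi_l^{(0)}} \right\rangle_{G''} \\
&= 1 - \int d\pi^{(0)}\, P(\pi^{(0)})\,\pi^{(0)} - \frac{c}{2} \int d\pi_1^{(0)}\, P(\pi_1^{(0)}) \int d\pi_2^{(0)}\, P(\pi_2^{(0)})\, \frac{e^{-y}\,\pi_1^{(0)}\pi_2^{(0)}}{1 - (1-e^{-y})\,\pi_1^{(0)}\pi_2^{(0)}} ,
\end{aligned} \tag{59} $$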



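where we have once more used the argument that the change in the histogram $P(\pi^{(0)})$ due to vertex or edge 
addition is negligible in the thermodynamic limit. 
The first line of Eq. (59) is consistent with the analogous Eq. (33) for a single graph. To see this, we notice that

$$ \begin{aligned}
\frac{c}{2}\left\langle \frac{e^{-y}\,\pi_j^{(0)}\pi_l^{(0)}}{1-(1-e^{-y})\,\pi_j^{(0)}\pi_l^{(0)}} \right\rangle_{G''}
&= \frac{1}{2}\left\langle \sum_{j\in N(i)} \frac{e^{-y}\,\pi_{i|j}^{(0)}\pi_{j|i}^{(0)}}{1-(1-e^{-y})\,\pi_{i|j}^{(0)}\pi_{j|i}^{(0)}} \right\rangle \\
&= \frac{1}{2}\left\langle \sum_{j\in N(i)} \frac{ e^{-y}\,\pi_{j|i}^{(0)}\, \dfrac{\prod_{k\in N(i)\setminus j}\bigl(1-\pi_{k|i}^{(0)}\bigr)}{e^{-y}+(1-e^{-y})\prod_{k\in N(i)\setminus j}\bigl(1-\pi_{k|i}^{(0)}\bigr)} }{ 1-(1-e^{-y})\,\pi_{j|i}^{(0)}\, \dfrac{\prod_{k\in N(i)\setminus j}\bigl(1-\pi_{k|i}^{(0)}\bigr)}{e^{-y}+(1-e^{-y})\prod_{k\in N(i)\setminus j}\bigl(1-\pi_{k|i}^{(0)}\bigr)} } \right\rangle \\
&= \frac{1}{2}\left\langle \frac{e^{-y}\sum_{j\in N(i)}\pi_{j|i}^{(0)}\prod_{k\in N(i)\setminus j}\bigl(1-\pi_{k|i}^{(0)}\bigr)}{e^{-y}+(1-e^{-y})\prod_{k\in N(i)}\bigl(1-\pi_{k|i}^{(0)}\bigr)} \right\rangle \\
&= \frac{1}{2}\bigl\langle \pi_i^{(*)} \bigr\rangle .
\end{aligned} \tag{60} $$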
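As an illustration of how the second line of Eq. (59) can be evaluated once a population of $\pi^{(0)}$ values is available, the following sketch estimates the two integrals by simple sampling. The function name `vc_density`, the number of sampled pairs, and the use of Monte Carlo sampling for the double integral are our assumptions; any other quadrature over the population would serve equally well.

```python
import math
import random

def vc_density(pop, c, y, pairs=100_000, seed=0):
    """Estimate x(c, y) from a population of pi^(0) values via Eq. (59).
    Hypothetical helper; the pair integral is estimated by random sampling."""
    rng = random.Random(seed)
    e = math.exp(-y)
    n = len(pop)
    mean_pi = sum(pop) / n                    # integral of P(pi) * pi
    acc = 0.0
    for _ in range(pairs):                    # independent pairs (pi_1, pi_2)
        a = pop[rng.randrange(n)]
        b = pop[rng.randrange(n)]
        acc += e * a * b / (1.0 - (1.0 - e) * a * b)
    return 1.0 - mean_pi - 0.5 * c * acc / pairs

# Example: a replica-symmetric population containing only the values 0 and 1.
print(vc_density([1.0] * 4 + [0.0] * 6, c=2.0, y=3.0))
```

For a population that contains only the values 0 and 1, with a fraction $p_2$ of ones, the estimate reduces to $1 - p_2 - c\,p_2^2/2$ up to sampling noise.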

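The mean complexity Σ(c, x) can be calculated analogously, cf. Ref. and Eqs. (37,38). The final expression 
reads

$$ \Sigma(c,x) = y\,x + \sum_{k=1}^{\infty} f_c(k) \prod_{l=1}^{k} \int d\pi_l^{(0)}\, P(\pi_l^{(0)})\, \ln\!\Bigl( e^{-y} + (1-e^{-y}) \prod_{l=1}^{k} \bigl(1-\pi_l^{(0)}\bigr) \Bigr) - \frac{c}{2} \int d\pi_1^{(0)}\, P(\pi_1^{(0)}) \int d\pi_2^{(0)}\, P(\pi_2^{(0)})\, \ln\!\Bigl( 1 - (1-e^{-y})\,\pi_1^{(0)}\pi_2^{(0)} \Bigr) . \tag{61} $$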
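A sampling estimate of Eq. (61), in the form y x plus a vertex term minus c/2 times an edge term, might look as follows. The names `complexity` and `_poisson` are hypothetical, the Poisson sampler is Knuth's elementary method (adequate only for moderate c), and the Monte Carlo treatment of the integrals is our choice. The k = 0 term is sampled as well, which is harmless because it contributes ln 1 = 0.

```python
import math
import random

def _poisson(lam, rng):
    """Knuth's multiplication method for a Poisson variate (small lam only)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def complexity(pop, c, y, samples=100_000, seed=0):
    """Monte Carlo estimate of the complexity from a population of pi^(0)
    values: y * x(c, y) + vertex term - (c / 2) * edge term."""
    rng = random.Random(seed)
    e = math.exp(-y)
    n = len(pop)
    site = edge = pair = 0.0
    for _ in range(samples):
        prod = 1.0
        for _parent in range(_poisson(c, rng)):
            prod *= 1.0 - pop[rng.randrange(n)]
        site += math.log(prod + e * (1.0 - prod))      # vertex contribution
        a, b = pop[rng.randrange(n)], pop[rng.randrange(n)]
        w = 1.0 - (1.0 - e) * a * b
        edge += math.log(w)                            # edge contribution
        pair += e * a * b / w                          # pair term of Eq. (59)
    x = 1.0 - sum(pop) / n - 0.5 * c * pair / samples  # VC density, Eq. (59)
    return y * x + site / samples - 0.5 * c * edge / samples

# Replica-symmetric check at c = 2: a fraction p2 = W(2)/2 ~ 0.4263 of ones.
print(complexity([1.0] * 4263 + [0.0] * 5737, c=2.0, y=3.0))
```

For a population consisting only of zeros and ones with $p_2 = e^{-c p_2}$, the three contributions cancel, so the estimate should fluctuate around zero; this is a useful sanity check of an implementation.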



C. Optimal re-weighting and minimal VC density 

At given average degree c, Eq. (59) allows us to calculate the typical VC size as a function of the re-weighting parameter 
y. It decreases monotonically with growing y, so naively one would expect that the minimal VC size is found in 
the limit y → ∞. There is, however, a problem: the complexity Σ(c, y) reaches zero at some (a priori) c-dependent 
value y*, and becomes negative for larger y. Since the complexity is defined as the logarithm of the number of corresponding clusters, 
negative complexities correspond to VC sizes that typically do not exist in random graphs of mean degree c. Consequently, 
we have to determine the size of the minimal VC of a typical graph as x(c, y*) from the zero-complexity criterion 
Σ(c, y*) = 0.

Figure 7 shows Σ(c, y) as a function of y at various fixed c values. At given c value (c > e), the complexity 
Σ(c, y) first increases as y increases from zero, and attains its maximal value at y ≈ 1.5. 
Afterwards, Σ(c, y) decreases with y and reaches Σ = 0 at y = y* ≈ 3.1. Upon further increase of y, the 
complexity becomes negative. It is remarkable that the Σ(c, y) curves for different c values intersect at (almost) the 
same point y*, which is just the point where the complexity vanishes, Σ(c, y*) = 0. At present we do not understand 
why the complexities for systems with different c values should approach zero at (almost) the same point. 

At each fixed c value, the optimal y* value can be determined from the condition Σ(c, y*) = 0. The optimal y* 
value was calculated numerically by population dynamics and is shown in Fig. 8 as a function of the mean vertex degree 
c. Figure 8 indeed demonstrates that the optimal re-weighting parameter y* is insensitive to c and stays at y* ≈ 3.1 
over the whole range of inspected c values. This is also in agreement with Ref. [19] (note that y* in the present article 
corresponds to 2y* in that reference). Even when c = 2.8 (just slightly beyond e) we have y* = 2.9 ± 1.2, which is significantly 
different from zero, but consistent with a constant y*. From Fig. 8 we thus get the impression that, as the mean 
vertex degree c exceeds e, the optimal re-weighting parameter jumps quickly to a value y* ≈ 3.1. 

The minimum vertex cover size can also be obtained. In Fig. 9 we show the relationship between the minimal 
vertex cover size and the mean vertex degree c. As a comparison, Fig. 9 also includes the mean minimal vertex cover 
size as estimated by the SP algorithm of the last section (N = 5000, y = 2.0, each point averaged over 20 samples). 
The results obtained by SP and those obtained by the mean-field statistical physics calculations are in very good 
agreement. At given vertex degree c, the minimal VC density estimated by SP and the mean-field theory is lower 
than the corresponding value obtained through exact enumeration followed by extrapolation [17]. The reason for such 
a discrepancy can be understood. According to Fig. 7, at given mean vertex degree c < 10, the maximum complexity 
of the system is less than 5 × 10^-3. This indicates that clustering of minimal VC solutions into distantly separated 
domains will only occur for random graphs with size N > 10^3. For small random graphs as used in Ref. [17], it is 
very likely that all the minimal VC solutions can be grouped into a single cluster (but with long-range frustrations 
among those vertices described by the joker state ∗ [20]). 

FIG. 7: (Color online) Complexity Σ(c, y) as a function of the re-weighting parameter y for fixed mean vertex degree c = 12.0 
(circles), c = 10.0 (squares), c = 8.0 (diamonds), and c = 6.0 (triangles). All these curves seem to intersect in two points, y = 0 
and y = y* ≈ 3.1. In these two points, Σ(c, y) = 0. 

FIG. 8: Relationship between the optimal re-weighting parameter y* and the mean vertex degree c. 

To summarize this subsection, we list in Table I the values of y* and the minimal VC density at several c values. 
Theoretical and SP results are extremely close to each other, even though the latter are systematically slightly larger. This 
may be due to various reasons: SP is run on graphs of finite size and actually constructs a (possibly non-optimal) VC, whereas 
the theory works at the 1RSB level, which in turn is not exact due to higher RSB effects. We expect, however, both 
estimates to be very close to (but slightly different from) the exact result. 






FIG. 9: (Color online) The fraction of covered vertices (VC density) x in a minimum vertex cover problem as a function of 
mean vertex degree c. Typical-case statistical physics results are given by a +, and x gives the estimates made by SP on graphs 
of size N = 5000, averaged over 20 samples. 

TABLE I: The optimal re-weighting parameter y* and the minimal VC density x as estimated by the 1RSB ansatz and by SP 
(N = 5000). 

   c      y*          x (theory)      x (SP)
   2.8    3(1)        0.4536290(8)    -
   3.0    3(1)        0.46632(2)      0.4661(5)
   4.0    3.1(1)      0.51934(2)      0.52004(4)
   5.0    3.11(4)     0.56033(3)      0.5607(3)
   6.0    3.10(2)     0.59341(3)      0.5942(3)
   7.0    3.08(2)     0.62088(2)      0.6214(3)
   8.0    3.07(1)     0.64416(2)      0.6453(2)
   9.0    3.069(9)    0.66423(2)      0.6655(3)
  10.0    3.068(8)    0.68175(2)      0.6834(2)



D. Relaxation time of the population dynamics 

Let us finally analyze the mean-field population dynamics which aims at finding a fixed-point distribution for 
Eq. (53). In the population dynamics, an array of $\mathcal{N}$ values $\pi^{(0)}$ is first initialized randomly (typically we use 
$\mathcal{N} \sim 10^6$; this number should not be confused with the vertex number N in the single-sample analysis of the previous 
sections). Then in each time step, corresponding to an interval $\Delta t = 1/\mathcal{N}$, we perform the following update of the 
population: 

(1) A natural number k is drawn from the Poisson distribution $f_c(k)$. 

(2) k elements $\pi_i^{(0)}$, i = 1, ..., k, are randomly and independently chosen from the current population. 

(3) A new $\pi_0^{(0)}$ is calculated according to Eq. (50). 

(4) A randomly selected element in the population is replaced with this new $\pi_0^{(0)}$ value. 

This iteration is repeated many times (typically of the order of $10^4\,\mathcal{N}$), until the statistical properties of the population 
approach stationary values. The histogram of the population is then our estimate of the self-consistent distribution 
$P(\pi_0^{(0)})$ in Eq. (53). 
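The four update steps listed above translate almost literally into code. The following sketch is deliberately small: the function names, the default population size and number of sweeps, and the elementary Poisson sampler (Knuth's method, suitable only for moderate c) are our assumptions and are far below the values of $\mathcal{N}$ and of the run length quoted in the text.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's multiplication method for a Poisson variate (small lam only)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def population_dynamics(c, y, size=2000, sweeps=50, seed=0):
    """Run the population dynamics for Eq. (53); one sweep = `size` updates."""
    rng = random.Random(seed)
    e = math.exp(-y)
    pop = [rng.random() for _ in range(size)]       # random initialization
    for _ in range(sweeps * size):
        k = poisson(c, rng)                         # step (1)
        prod = 1.0
        for _parent in range(k):                    # step (2)
            prod *= 1.0 - pop[rng.randrange(size)]
        # step (3), Eq. (50); denominator equals e^{-y} + (1 - e^{-y}) * prod
        new = prod / (prod + e * (1.0 - prod))
        pop[rng.randrange(size)] = new              # step (4)
    return pop

pop = population_dynamics(c=2.0, y=3.0, size=500, sweeps=20, seed=1)
print(sum(pop) / len(pop))    # mean of pi^(0) over the final population
```

A histogram of the returned list then plays the role of the estimate of $P(\pi_0^{(0)})$; in practice one would monitor a few moments of the population to decide when stationarity has been reached.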

Suppose that, at time t, the histogram of $\pi^{(0)}$ over the whole population is given by 

$$ P(\pi^{(0)};t) = p_1(t)\,\delta(\pi^{(0)}) + p_2(t)\,\delta(\pi^{(0)}-1) + p_3(t)\,p(\pi^{(0)};t) , \tag{62} $$
with $p_3(t) = 1 - p_1(t) - p_2(t)$, and with $p(\pi^{(0)};t)$ satisfying the conditions 

$$ p(0;t) = 0 , \qquad p(1;t) = 0 , \qquad \int_0^1 d\pi^{(0)}\, p(\pi^{(0)};t) = 1 . \tag{63} $$

Since the population of $\pi^{(0)}$ values at time $t + \Delta t$ is obtained by replacing one randomly chosen element of the 
population at time t with the newly calculated $\pi_0^{(0)}$, we can write down the following two evolution equations for $p_1(t)$ 
and $p_2(t)$: 

$$ \mathcal{N}\,p_1(t+\Delta t) = \mathcal{N}\,p_1(t) + \sum_{k=0}^{\infty} f_c(k)\bigl[1-(1-p_2(t))^k\bigr] - p_1(t) \quad\longrightarrow\quad \frac{dp_1(t)}{dt} = 1 - e^{-c\,p_2(t)} - p_1(t) , \tag{64} $$

$$ \mathcal{N}\,p_2(t+\Delta t) = \mathcal{N}\,p_2(t) + \sum_{k=0}^{\infty} f_c(k)\,p_1^k(t) - p_2(t) \quad\longrightarrow\quad \frac{dp_2(t)}{dt} = e^{-c\,(1-p_1(t))} - p_2(t) . \tag{65} $$



More precisely, these equations describe the average evolution over many runs of the population dynamics. For large 
populations, $\mathcal{N} \gg 1$, the true evolution of one population is, however, expected to be closely concentrated around 
its expectation value, with random fluctuations of the order $O(1/\sqrt{\mathcal{N}})$. These equations can be understood easily: 
In Eq. (64), we describe the expected number of zero elements of the population. This number is decreased by one 
with probability $p_1(t)$ by replacing an old zero element, or it grows by one if a new zero element is introduced. The 
latter case happens if among the k "parents" $\pi_i^{(0)}$, i = 1, ..., k, selected before, there exists at least one which 
equals one. Analogously, a new element equal to one is inserted in the population if all parents were equal to zero, 
explaining the gain term in Eq. (65). 

The fixed-point solution $(p_1, p_2)$ of Eqs. (64) and (65) is determined by 

$$ p_1 = 1 - e^{-c\,p_2} , \qquad p_2 = e^{-c\,(1-p_1)} . \tag{66} $$

Note that for c < e, only one solution, with $p_1 + p_2 = 1$, exists. This solution corresponds to replica symmetry: only one 
solution cluster exists, and consequently there are no cluster-to-cluster fluctuations. Above c = e, two other solutions 
with $p_1 + p_2 \neq 1$ also exist. Only one of them is iteratively stable; it fulfills $p_1 + p_2 < 1$ and therefore allows for cluster-to-cluster 
variations of the state of some vertices i. 
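The two regimes of Eq. (66) are easy to inspect numerically. The sketch below iterates the two fixed-point conditions simultaneously, starting from $p_1 = p_2 = 0$; the function name `stationary_weights`, the starting point, and the stopping rule are our assumptions. Starting from zero, the iteration settles on the solution with $p_1 + p_2 \le 1$.

```python
import math

def stationary_weights(c, tol=1e-14, max_iter=1_000_000):
    """Iterate Eq. (66) from p1 = p2 = 0 and return (p1, p2, p3)."""
    p1 = p2 = 0.0
    for _ in range(max_iter):
        new_p1 = 1.0 - math.exp(-c * p2)
        new_p2 = math.exp(-c * (1.0 - p1))
        done = abs(new_p1 - p1) < tol and abs(new_p2 - p2) < tol
        p1, p2 = new_p1, new_p2
        if done:
            break
    return p1, p2, 1.0 - p1 - p2

for c in (2.0, 4.0):
    p1, p2, p3 = stationary_weights(c)
    print(c, p1, p2, p3)
```

For c = 2 the weight $p_3$ of the continuous part vanishes (replica symmetry), whereas for c = 4 a finite $p_3$ remains. The iteration becomes slow close to c = e, in line with the critical slowing down discussed in the text.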

To study the convergence velocity toward this fixed-point solution, we assume that 

$$ p_1(t) = p_1 + \epsilon_1(t) , \qquad p_2(t) = p_2 + \epsilon_2(t) , \tag{67} $$

where $\epsilon_1(t)$ and $\epsilon_2(t)$ are small quantities. Linearizing the dynamical equations, we find 

$$ \frac{d\epsilon_1}{dt} = c\,(1-p_1)\,\epsilon_2(t) - \epsilon_1(t) , \qquad \frac{d\epsilon_2}{dt} = c\,p_2\,\epsilon_1(t) - \epsilon_2(t) , \tag{68} $$
and the typical relaxation time for $p_1(t)$ and $p_2(t)$ is 

$$ \tau_{12} = \frac{1}{1 - c\sqrt{p_2\,(1-p_1)}} . \tag{69} $$

When the mean vertex degree c approaches e from below, the parameter $p_2$ approaches $e^{-1}$ as 

$$ p_2 \simeq \frac{1}{e} + \frac{e-c}{2e^2} . \tag{70} $$

Consequently, the typical relaxation time $\tau_{12}$ diverges as 

$$ \tau_{12} \simeq \frac{2e}{e-c} , \qquad \text{for } c < e . \tag{71} $$

Note that the same critical behavior was found for the bug relaxation time in the purely replica-symmetric 
warning-propagation equations. 

On the other hand, when c approaches e from above, 

$$ p_2 \simeq \frac{1}{e} - \frac{1}{e}\left(\frac{6\,(c-e)}{e}\right)^{1/2} + \frac{c-e}{e^2} , \tag{72} $$

therefore $\tau_{12}$ diverges as 

$$ \tau_{12} \simeq \frac{e}{c-e} , \qquad \text{for } c > e . \tag{73} $$

Equations (71) and (73) were confirmed in single-graph message-passing experiments. Note that the only region 
where this convergence slows down is close to the replica-symmetry-breaking transition at c = e, where the 
population dynamics also slows down critically. Note also that this relaxation time does not depend on the re-weighting 
parameter y. 
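The divergence of the relaxation time on both sides of c = e can be checked numerically. The sketch below assumes that the relaxation time is the inverse of the smallest decay rate of the linearized system (68), i.e. $\tau_{12} = 1/(1 - c\sqrt{p_2(1-p_1)})$, and compares it with the asymptotic forms 2e/(e − c) below and e/(c − e) above the transition. All function names are hypothetical.

```python
import math

def stationary_weights(c, tol=1e-14, max_iter=1_000_000):
    """Iterate the fixed-point conditions (66) from p1 = p2 = 0."""
    p1 = p2 = 0.0
    for _ in range(max_iter):
        new_p1 = 1.0 - math.exp(-c * p2)
        new_p2 = math.exp(-c * (1.0 - p1))
        done = abs(new_p1 - p1) < tol and abs(new_p2 - p2) < tol
        p1, p2 = new_p1, new_p2
        if done:
            break
    return p1, p2

def relaxation_time(c):
    """1 / (1 - mu), where -1 +/- mu are the eigenvalues of the linearized
    dynamics (68) and mu = c * sqrt(p2 * (1 - p1))."""
    p1, p2 = stationary_weights(c)
    mu = c * math.sqrt(p2 * (1.0 - p1))
    return 1.0 / (1.0 - mu)

delta = 0.05
below = relaxation_time(math.e - delta)
above = relaxation_time(math.e + delta)
print(below, 2.0 * math.e / delta)   # numerical value vs. asymptotic form, c < e
print(above, math.e / delta)         # numerical value vs. asymptotic form, c > e
```

The agreement improves as the distance to c = e shrinks, at the price of a larger number of fixed-point iterations.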

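We now study the evolution of $p(\pi^{(0)};t)$ in Eq. (62). For this purpose, in the population dynamics we can set 
$p_1(t) = p_1$, $p_2(t) = p_2$, $p_3(t) = p_3 = 1 - p_1 - p_2$ to their stationary values, and store only those $\pi^{(0)}$ values that 
satisfy $0 < \pi^{(0)} < 1$ in the population array. The distribution $p(\pi^{(0)};t+\Delta t)$ is related to $p(\pi^{(0)};t)$ by the following 
equation, describing the expected number of population entries in the interval $(\pi^{(0)}, \pi^{(0)} + \Delta\pi^{(0)})$: 

$$ \mathcal{N}\,p(\pi^{(0)};t+\Delta t)\,\Delta\pi^{(0)} = \mathcal{N}\,p(\pi^{(0)};t)\,\Delta\pi^{(0)} - p(\pi^{(0)};t)\,\Delta\pi^{(0)} + \sum_{m=1}^{\infty} \frac{f_{cp_3}(m)}{1-e^{-cp_3}} \prod_{l=1}^{m} \int_0^1 d\pi_l^{(0)}\, p(\pi_l^{(0)};t)\; \delta\!\left( \pi^{(0)} - \frac{\prod_{l=1}^{m}\bigl(1-\pi_l^{(0)}\bigr)}{e^{-y}+(1-e^{-y})\prod_{l=1}^{m}\bigl(1-\pi_l^{(0)}\bigr)} \right) \Delta\pi^{(0)} . \tag{74} $$

From Eq. (74) we see that 

$$ \frac{\partial p(\pi^{(0)};t)}{\partial t} = -p(\pi^{(0)};t) + \sum_{m=1}^{\infty} \frac{f_{cp_3}(m)}{1-e^{-cp_3}} \prod_{l=1}^{m} \int_0^1 d\pi_l^{(0)}\, p(\pi_l^{(0)};t)\; \delta\!\left( \pi^{(0)} - \frac{\prod_{l=1}^{m}\bigl(1-\pi_l^{(0)}\bigr)}{e^{-y}+(1-e^{-y})\prod_{l=1}^{m}\bigl(1-\pi_l^{(0)}\bigr)} \right) . \tag{75} $$

The fixed-point solution of Eq. (75) is 

$$ p(\pi^{(0)}) = \sum_{m=1}^{\infty} \frac{f_{cp_3}(m)}{1-e^{-cp_3}} \prod_{l=1}^{m} \int_0^1 d\pi_l^{(0)}\, p(\pi_l^{(0)})\; \delta\!\left( \pi^{(0)} - \frac{\prod_{l=1}^{m}\bigl(1-\pi_l^{(0)}\bigr)}{e^{-y}+(1-e^{-y})\prod_{l=1}^{m}\bigl(1-\pi_l^{(0)}\bigr)} \right) , \tag{76} $$

as can be seen also directly from Eqs. (53) and (62). 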

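Now let us suppose that, at time t, the actual distribution $p(\pi^{(0)};t)$ deviates from the fixed-point distribution only 
slightly: 

$$ p(\pi^{(0)};t) = p(\pi^{(0)}) + \epsilon_3(\pi^{(0)};t) , \tag{77} $$
with $|\epsilon_3(\pi^{(0)};t)| \ll 1$ for all $0 < \pi^{(0)} < 1$ and 

$$ \epsilon_3(0;t) = \epsilon_3(1;t) = 0 , \qquad \int_0^1 d\pi^{(0)}\, \epsilon_3(\pi^{(0)};t) = 0 . \tag{78} $$
The linearized evolution equation for $\epsilon_3(\pi^{(0)};t)$ is 

$$ \begin{aligned}
\frac{\partial \epsilon_3(\pi^{(0)};t)}{\partial t} = {} & -\epsilon_3(\pi^{(0)};t) + c\,p_2\, \frac{e^{-y}}{\bigl(1+(e^{-y}-1)\,\pi^{(0)}\bigr)^2}\; \epsilon_3\!\left( \frac{1-\pi^{(0)}}{1+(e^{-y}-1)\,\pi^{(0)}} ; t \right) \\
& + c\,p_3 \int_0^1 d\pi_1^{(0)}\, \epsilon_3(\pi_1^{(0)};t) \int_0^1 d\pi_2^{(0)}\, p(\pi_2^{(0)})\; \delta\!\left( \pi^{(0)} - \frac{\bigl(1-\pi_1^{(0)}\bigr)\,\pi_2^{(0)}}{1-(1-e^{-y})\,\pi_1^{(0)}\pi_2^{(0)}} \right) .
\end{aligned} \tag{79} $$

The stability of Eq. (79) can be analyzed by Fourier-expanding $\epsilon_3(\pi^{(0)};t)$ in the following way (the factor $\pi$ multiplying m is the usual constant, not a message): 

$$ \epsilon_3(\pi^{(0)};t) = \sum_{m=1}^{\infty} a_m(t)\,\sqrt{2}\,\sin\bigl(m\pi\,\pi^{(0)}\bigr) , \tag{80} $$

with coefficients $a_m(t)$ satisfying the global constraint 

$$ \sum_{n=0}^{\infty} \frac{a_{2n+1}(t)}{2n+1} = 0 . \tag{81} $$

Based on Eq. (79) and Eq. (80), one can write down the evolution equation for $a_m(t)$: 

$$ \frac{da_m(t)}{dt} = \sum_{n=1}^{\infty} A_{mn}\,a_n(t) , \tag{82} $$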



where the elements of the matrix A can be easily written down. 

The task now is to identify the dominant eigenmode of Eq. (82) under the constraint of Eq. (81). We have 
performed such an analysis for various c values in the range 3 < c < 30 and various y values ranging from y = 0 to 
y = 5. In all the cases studied, the dominant eigenmode of Eq. (82) decays to zero very quickly, indicating that the 
mean-field population dynamics converges exponentially fast toward its fixed point. Compared to the iterative 
stability of SP on single instances of random graphs, we find that the messages may converge in the population even 
if they no longer converge on the single graph. This is interesting since it allows us to extend the typical-case 
estimates to a region where SP applied to single samples fails to predict anything. 



VII. CONCLUSION 



In this paper, we have formulated two message-passing procedures for solving - or approximating - the minimal 
vertex cover problem, namely warning propagation and survey propagation. We have analyzed the performance of 
both algorithms on the test bed of finite-connectivity random graphs, where previous statistical-physics approaches 
based both on the replica approach and on the cavity method provide insight into the phase diagram. We have also 
discussed in detail how the message-passing approach is technically connected to these typical-case 
statistical-physics results. 

For small average vertex degrees c < e, replica symmetry is known to hold in the space of all minimal vertex covers. 
Therefore the simpler of the two algorithms - warning propagation, which is based on the replica-symmetric 
Bethe-Peierls iterative approach - is applicable. Comparing it to the exact leaf-removal algorithm, we have shown 
that it outputs true minimal vertex covers. Unfortunately, the iterative solution of the warning propagation equations 
slows down critically as we approach c = e from below, and it does not converge at all for higher average degrees. 

In this higher-degree region, replica symmetry is known to be broken. We have therefore applied a survey propagation 
algorithm which is based on the first step of replica symmetry breaking. We have identified a parameter range 
where the message passing equations converge to a globally self-consistent solution, which can be used to construct 
small vertex covers. We have found that the resulting covers not only outperform simple local-search procedures, but 
are also consistent with exact asymptotic results for high but finite average degrees. Interestingly, the vertex covers produced 
at the end are relatively insensitive to details of the algorithms (in particular to the somewhat heuristic choice of the 
re-weighting parameter y). 

In the case of vertex cover, replica symmetry is known to be fully broken, i.e. the exact solution for c > e is known 
to be more complicated than the one described by one-step replica-symmetry breaking. Intuitively, the solutions are 
expected to be organized in clusters of clusters of clusters etc. So, even if the results of the application of survey 
propagation are very promising, it is expected to provide only a good approximation algorithm. It could therefore 
be interesting to go beyond survey propagation and to formulate an algorithm based on the second step of replica 
symmetry breaking (corresponding to two hierarchical clustering levels), in order to see whether the higher complexity of 
the required algorithm leads to even smaller vertex covers than survey propagation. 

As a last point, it could be interesting to apply the algorithm to real-world covering problems, possibly adapting 
it to the specific nature of these tasks, which may be similar but not identical to the original minimal VC problem. 
These problems are frequently characterized by a broad degree distribution of the underlying networks. Their extreme 
heterogeneity may result in a better performance of simple heuristic algorithms exploiting local network structures. 
On the other hand, it was shown in [3] that assortative degree correlations may force replica symmetry to break also 
in scale-free networks, and algorithms like survey propagation are expected to become efficient. A related interesting 
question is the local network structure beyond the vertex degrees, in particular small loops or other small dense 
subgraphs. It may be practically necessary to coarse-grain the graph, considering such loops as single constraints and 
applying the factorization hypothesis only to larger structures. In the replica-symmetric case (warning or belief 
propagation), this corresponds to the region-graph method proposed in [12]; for the one-step replica-symmetry-broken 
case (survey propagation) it is still an open technical challenge. 




Acknowledgments: We gratefully acknowledge discussions with W. Barthel, A.K. Hartmann and G. Semerjian. 



[1] M. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (Freeman, San Francisco, 1979). 
[2] P. Echenique, J. Gomez-Gardenes, Y. Moreno, and A. Vazquez, Phys. Rev. E 71, 035102(R) (2005). 
[3] A. Vazquez and M. Weigt, Phys. Rev. E 67, 027101 (2003). 
[4] Y. Breitbart, C. Chan, M. Garofalakis, R. Rastogi, and A. Silberschatz, Proc. IEEE INFOCOM (2001). 
[5] K. Park and H. Lee, Proc. ACM SIGCOMM (2001). 
[6] J. Gomez-Gardenes, P. Echenique, and Y. Moreno, Eur. Phys. J. B 49, 259 (2006). 
[7] E. Marinari, R. Monasson, and G. Semerjian, Europhys. Lett. 73, 8 (2006). 
[8] A. Hartmann and M. Weigt, Phase Transitions in Combinatorial Optimization Problems (Wiley-VCH, Berlin, 2005). 
[9] A. Percus, G. Istrate, and C. Moore, eds., Computational Complexity and Statistical Physics (Oxford University Press, New York, 2006). 
[10] M. Mezard and G. Parisi, Eur. Phys. J. B 20, 217 (2001). 
[11] M. Mezard, G. Parisi, and R. Zecchina, Science 297, 812 (2002). 
[12] J. Yedidia, W. Freeman, and Y. Weiss, MERL Technical report TR-2001-22 (2002). 
[13] B. Bollobas, Random Graphs (Academic Press, London, 1985). 
[14] A. Frieze, Discr. Math. 81, 171 (1990). 
[15] P. Gazmuri, Networks 14, 367 (1984). 
[16] M. Weigt and A. K. Hartmann, Phys. Rev. E 63, 056127 (2001). 
[17] M. Weigt and A. K. Hartmann, Phys. Rev. Lett. 84, 6118 (2000). 
[18] M. Bauer and O. Golinelli, Eur. Phys. J. B 24, 339 (2001). 
[19] H. Zhou, Eur. Phys. J. B 32, 265 (2003). 
[20] H. Zhou, Phys. Rev. Lett. 94, 217203 (2005). 
[21] G. Biroli and M. Mezard, Phys. Rev. Lett. 88, 025501 (2002). 
[22] A. Hartmann and M. Weigt, Europhys. Lett. 62, 533 (2003). 
[23] M. P. Ciamarra, M. Tarzia, A. de Candia, and A. Coniglio, Phys. Rev. E 67, 057105 (2003). 
[24] O. Rivoire, G. Biroli, O. Martin, and M. Mezard, Eur. Phys. J. B 37, 55 (2004). 
[25] H. Hansen-Goos and M. Weigt, J. Stat. Mech. p. P04006 (2005). 
[26] A. Braunstein, R. Mulet, A. Pagnani, M. Weigt, and R. Zecchina, Phys. Rev. E 68, 036702 (2003). 
[27] A. Braunstein, M. Mezard, M. Weigt, and R. Zecchina (2006). 
[28] F. Kschischang, B. Frey, and H. Loeliger, IEEE Trans. Info. Th. 47, 498 (2001). 
[29] R. Karp and M. Sipser, Proc. 22nd Ann. IEEE Symp. Found. Comp. Sci. p. 364 (1981). 
[30] H. Zhou and Z. Ou-Yang, e-print: cond-mat/0309348 (2003). 
[31] L. Zdeborova and M. Mezard, J. Stat. Mech. p. P05003 (2006). 
[32] S. Mertens, M. Mezard, and R. Zecchina, Rand. Struct. Alg. 28, 340 (2006). 
[33] M. Mezard and R. Zecchina, Phys. Rev. E 66, 056126 (2002). 
[34] M. Mezard, G. Parisi, and M. Virasoro, Spin Glass Theory and Beyond (World Scientific, Singapore, 1987). 
[35] R. Monasson, Phys. Rev. Lett. 75, 2847 (1995). 
[36] A. Montanari, G. Parisi, and F. Ricci, J. Phys. A 37, 2073 (2004). 
[37] M. Weigt, Eur. Phys. J. B 28, 369 (2002). 
[38] G. Semerjian and E. Marinari, J. Stat. Mech. p. P06019 (2006). 



